AD-A204  480 




VISUAL  REPRESENTATIONS  OF  TEXTURE 


Final  Report  AFOSR  Grant  85-0359 
(September  1,  1985  -  November  30,  1988) 


Part  I 


Jacob  Beck 

Department of Psychology
University  of  Oregon 
Eugene,  OR  97403 


Part  II 


Kent  A.  Stevens 


Department  of  Computer  and  Information  Sciences 
University  of  Oregon 
Eugene, OR 97403


DISTRIBUTION STATEMENT A

Approved for public release;
Distribution Unlimited


SECURITY CLASSIFICATION OF THIS PAGE


REPORT  DOCUMENTATION  PAGE 


1a. REPORT SECURITY CLASSIFICATION


2a. SECURITY CLASSIFICATION AUTHORITY


2b.  DECLASSIFICATION /DOWNGRADING  SCHEDULE 


4.  PERFORMING  ORGANIZATION  REPORT  NUMBER(S) 


1b.  RESTRICTIVE  MARKINGS 


3.  DISTRIBUTION/ AVAILABILITY  OF  REPORT 


5.  MONITORING  ORGANIZATION  REPORT  NUMBER(S) 

AFOSR-TR-89-0131


6a.  NAME  OF  PERFORMING  ORGANIZATION 

Department  of  Psychology 
University  of  Oregon 


6c. ADDRESS (City, State, and ZIP Code)

Eugene,  OR  97403-0237 


8a.  NAME  OF  FUNDING /SPONSORING 
ORGANIZATION 


8c. ADDRESS (City, State, and ZIP Code)

Building  410 

Bolling  AFB  DC  20332-6448 


11.  TITLE  (Include  Security  Classification) 

Visual  Representations  of  Texture 


6b.  OFFICE  SYMBOL  7a.  NAME  OF  MONITORING  ORGANIZATION 

RFCCftiML 


9. PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER


AFOSR-85-0359 


10.  SOURCE  OF  FUNDING  NUMBERS 


TASK 

WORK  UNIT 

NO. 

ACCESSION  NO 

12.  PERSONAL  AUTHOR(S) 

Jacob Beck, Kent A. Stevens

13a. TYPE OF REPORT

Final


16. SUPPLEMENTARY NOTATION


13b. TIME COVERED
FROM


UNCLASSIFIED 


14. DATE OF REPORT (Year, Month, Day)  15. PAGE COUNT

December 15, 1988  117


17. COSATI CODES


FIELD | GROUP | SUB-GROUP


18. SUBJECT TERMS (Continue on reverse if necessary and identify by block number)


Vision,  texture  perception,  texture  segmentation 


19.  ABSTRACT  (Continue  on  reverse  if  necessary  and  identify  by  block  number) 

Our research is concerned with understanding both the computational and neurophysiological
bases of texture segregation. During the grant period we have (a) conducted a series
of experiments investigating the interaction of size and contrast in texture segregation,
(b) compared our experimental results with the calculated outputs of a 2D Gabor model of
simple-cell-like spatial-frequency channels, (c) established that the function describing
perceived segregation of a texture resulting from lightness differences of the texture
elements is not the same as the function describing the perceived lightness differences of
the elements. We also showed that the outputs of spatial frequency channels that predict
the perceived segregation of texture regions fail to predict the perceived salience of a
line composed of disconnected shapes embedded in a background of the same shapes.

The second part of the report describes work by Stevens on the earliest levels in the
extraction of geometric structure. The work has involved a computational and psychophysical
study of the role of retinal and cortical spatial frequency filters in the extraction (over)


20. DISTRIBUTION/AVAILABILITY OF ABSTRACT
☒ UNCLASSIFIED/UNLIMITED  □ SAME AS RPT.  □ DTIC USERS


21. ABSTRACT SECURITY CLASSIFICATION

Unclassified


DD FORM 1473, 84 MAR


22b. TELEPHONE (Include Area Code)  22c. OFFICE SYMBOL

(202)  767-5021  AFOSR/NL 


83  APR  edition  may  be  used  until  exhausted.  SECURITY  CLASSIFICATION  OF  THIS  PAGE 

All  other  editions  are  obsolete 


'2  1  DFH  1989 


19.  Abstract  (cont) 


of contour information. The specific areas reported concern: i) the differential roles
of radially-symmetric and elongated receptive fields on the Cafe wall illusion, a
pattern that is useful for the induction of illusory brightness bands and orientation,
ii) a strategy for parsing of band-pass filtered images to differentiate line-like versus
edge-like luminance changes, iii) asserting orientation between discrete items, and
iv) connecting contour fragments across luminance gaps. Across these areas one common
theme is the importance of spatial gating nonlinearity.




CONTENTS

Part I

1. Abstract

2. Research Summary

2.1 Introduction

2.2 Spatial-frequency channels in texture segregation

2.2.1 How spatial-frequency channels might explain the area x contrast tradeoff

2.2.2 Simple spatial-frequency channels model

2.3 Experiments compared to predictions of simple model

2.3.1 Area-ratios and background luminance

2.3.2 Line vs. square elements

2.3.3 Pattern density (duty-cycle)

2.3.4 Varying fundamental frequency (scaling)

2.3.5 Squares, circles, blobs, and aligned vs. nonaligned squares

2.4 Patterns with no energy at the fundamental: results and predictions of simple model

2.5 Patterns with same- and opposite-sign-of-contrast: results and predictions of simple model

2.6 Complex spatial-frequency channels model

2.7 Comparison of perceived segregation and perceived lightness

2.8 Global popout

2.8.1 Comparison of texture segregation and the popout of lines

2.8.2 Popout experiments

2.8.3 Difference between texture segregation and line detection

3. Publications

4. Professional Personnel

5. Meetings

6. References




1.  ABSTRACT 


Our research is concerned with understanding both the
computational and neurophysiological bases of texture segregation.
During the grant period we have (a) conducted a series of
experiments investigating the interaction of size and contrast in
texture segregation, (b) compared our experimental results with the
calculated outputs of a 2D Gabor model of simple-cell-like
spatial-frequency channels, (c) established that the function
describing perceived segregation of a texture resulting from
lightness differences of the texture elements is not the same as
the function describing the perceived lightness differences of the
elements. We also showed that the outputs of spatial frequency
channels that predict the perceived segregation of texture regions
fail to predict the perceived salience of a line composed of
disconnected shapes embedded in a background of the same shapes.

The trade-off between size and contrast follows from the
hypothesis that strong preattentive texture segregation occurs
when spatial-frequency channels yield a large difference in mean
or modulated activity to two textures. Overall, the results of our
experiments were consistent with the hypothesis that texture
segregation occurs as a result of the differential stimulation of
spatial-frequency channels. Some aspects of the results were not
consistent with it, however: those from experiments in which the
fundamental frequency of the texture was varied, from textures
containing elements of opposite contrast-sign, and from textures
containing balanced elements with no energy at their fundamental
frequency. These discrepancies suggest that our model was not making
sufficient use of information in the higher harmonic channels. One
way in which the information in the higher harmonics may be used
involves a more complicated spatial-frequency channels model. In
this model, each channel contains, in addition to an initial
linear filtering, a nonlinear rectification followed by a second
linear filtering. Both filterings are selective for
spatial frequency and orientation. We call these channels
"complex" both to distinguish them from the simple channels and
because this kind of channel is similar to current models of
complex cells. The complex channel model, taking into consideration
the effects of light adaptation, is able to account qualitatively
for all of these discrepancies.

A striking finding is that we have been able to construct
texture displays composed of elements differing clearly in
lightness which fail to yield perceived segregation. Equal
lightness differences can lead to markedly different degrees of
perceived segregation depending on the ratio of the background
luminance to the luminance of the high-luminance square. The
functions describing the perceived lightness differences of the
squares and the perceived segregation of the texture displays are
not the same functions. Our experiments show that segregation
ratings are a function of the ratio of the contrasts of the high-
to low-contrast squares. Contrast is defined as the difference
between the luminance of the square and the luminance of the
background, divided by the luminance of the background. The relevant
variable for texture segregation is therefore the luminance
increment ratio of the target to the background. For perceived
lightness, the relevant variable is instead the luminance ratio of
the target to the background.

In a display composed of disconnected shapes, a line may
sometimes pop out even though the shapes in the line do not differ
from the other shapes in the display and do not occupy a disjoint
region. The spacing between the shapes in the line and the spacing
in the background are also similar. The difference is that the
shapes in the line are approximately aligned and there is a greater
density of shapes in the direction of the line. In our texture
patterns, the arrangement of local properties is different in
different regions, so that if the display is suitably filtered by
convolving the appropriate property at each point, or by performing
some equivalent filtering process in the Fourier domain, the
filtered display differs from region to region. This type of
computation does not appear to be able to account for the global
popout of lines of disconnected shapes in which the line does not
occupy a disjoint region of the display.

2.  RESEARCH  SUMMARY 

2.1 Introduction

Models of texture segregation fall into three classes. In one
class of models, texture segregation is based on the geometric
features of a texture pattern. Beck (1972, 1982) and Marr (1976)
proposed that texture segregation is based on differences in
first-order statistics of simple features of a texture pattern such
as the slopes and sizes of texture elements or of their component
parts. Julesz (1981) has labeled such features textons and
proposed that there are three kinds: elongated blobs (e.g. line
segments), terminators (e.g. line terminations), and intersections
(e.g. crossings of line segments). In a second class of models,
the primitives for texture segregation are not geometric features
but the outputs of receptive-field-like operators. In many of the
models in this class, texture segregation is based on differences
in image statistics following convolution of a texture with
local, linear filters that have weighting functions like the
receptive fields of simple cells. A number of investigators have
proposed that texture segregation is, at least sometimes, directly
based on differences in the outputs of spatial-frequency channels
(e.g. Beck, Sutter, & Ivry, 1987; Caelli, 1985; Daugman, 1987;
Ginsburg, 1984; Graham, 1981; Grossberg & Mingolla, 1985; Turner,
1986). Spatial-frequency channels are quasi-independent, parallel
channels composed of local receptive fields that are distributed
throughout the visual field and are alike in their sensitivity to
spatial frequency and orientation. The evidence that the visual
system analyzes a stimulus into a set of spatial-frequency
channels, and the usefulness of such channels in visual modeling,
have been reviewed elsewhere (e.g., Graham, 1980, 1981, 1985).

In a third class of models, texture segregation is based on
differences in second-order statistics of the luminances at
different points in the texture. The second-order statistics of
a region are based on the joint probability distribution that a
pair of points separated by a given distance and orientation have
particular gray levels. Julesz's original conjecture (1975)
considered two gray levels, and the extension of this conjecture
to patterns containing many gray levels must be done carefully
(Klein and Tyler, 1986). Julesz (1975) conjectured that textures
with the same global second-order statistics do not segregate, but
counter-examples to Julesz's conjecture have been found (Julesz,
Gilbert & Victor, 1978; Victor & Brodie, 1978). Gagalowicz (1981)
pointed out that the counter-examples involve patterns in which
local second-order statistics are not the same throughout the
pattern and differ from the global second-order statistics. He
hypothesized that textures which have the same local second-order
statistics throughout will fail to segregate. It should be noted
that if texture segregation is a function of only the amplitudes
and not the phases of the spatial frequencies, then the
spatial-frequency channel approach is closely related to models
based on second-order statistics.
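
The definition of second-order statistics given above can be made concrete with a minimal sketch. It is illustrative only: the function name `second_order_stats` and the toy 4 x 4 two-gray-level patterns are assumptions introduced here, not stimuli or code from the experiments. For one displacement (distance and orientation expressed as a pixel offset), it estimates the joint probability that a pair of points has each pair of gray levels.

```python
from collections import Counter

def second_order_stats(image, dx, dy):
    """Joint probability of gray-level pairs at points separated by (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose second point falls inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

# Hypothetical toy textures: vertical stripes and a checked arrangement.
stripes = [[0, 1, 0, 1] for _ in range(4)]
checks = [[(r + c) % 2 for c in range(4)] for r in range(4)]

# One-pixel vertical displacement: the two arrangements give different statistics.
stripe_stats = second_order_stats(stripes, dx=0, dy=1)
check_stats = second_order_stats(checks, dx=0, dy=1)
```

For this displacement the striped toy pattern contains only like pairs and the checked one only unlike pairs, so the two differ in second-order statistics even though each gray level covers half of both patterns.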

Beck, Sutter, and Ivry (1987) investigated texture segregation
in a three-part (tripartite) pattern in which each part contains
approximately equal numbers of two different elements (large and
small squares in Figure 1). The textures to be segregated differed
in the arrangement of the two elements. In the top and bottom
regions, the two elements were arranged in vertical stripes. In
the center region, the elements were arranged in a checked pattern.
They reported four important findings: (1) Size and contrast are
not independent attributes but can cancel each other. For example,
large and small squares of equal contrast yielded strong texture
segregation. However, texture segregation was reduced greatly if
the contrast of the small square was increased so that the areal
contrasts (area x contrast) of the large and small squares were
equal. (2) A difference in sign of contrast yields strong texture
segregation. Squares of equal size whose luminances were above and
below the background luminance yielded strong texture segregation
even when the ratio of background luminance to square luminance was
very close to 1.0. (3) Texture segregation does not vary directly
with the lightness difference of the squares. Equal size squares
differing by a large lightness difference failed to give texture
segregation, while the same squares differing by a smaller
lightness difference yielded strong texture segregation, depending
upon the background luminance. (4) Hue is a weak feature for
texture segregation. In the absence of size and contrast
differences, hue differences failed to yield strong spontaneous
texture segregation. Our research has focused on investigating in
detail the first three of the above findings.

2.2  Spatial-frequency  channels  in  texture  segregation 

When viewing the pattern in Figure 1, subjects reported
spontaneously segregating it into two textures: stripes in the top
and bottom regions and a checked region in the center (a perception
that we call "tripartite segregation"). The segregation of the
pattern into three regions is not surprising and is explainable in
many ways. Of interest is what happened when the large and small
squares differed in contrast as well as size. Figure 2 shows the
same display with the contrast of the small squares increased. As
the contrast of the small squares increases, starting at a value
equal to that of the large squares, perceived segregation
decreases. After sufficient increase in contrast, perceived
segregation reaches a minimum and then starts increasing again.
[Some results are shown in Figs. 7-9 and in Beck, Sutter, & Ivry
(1987).] The minimum tends to occur when the areal contrasts (area
x contrast) of the large and small squares are equal.
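
The equal-areal-contrast condition can be illustrated with a small worked sketch. The function names and the contrast value 0.1 are hypothetical, chosen only for illustration; the 16- and 8-pixel sides are the square sizes reported in Section 2.3.1, which give the 4:1 area ratio.

```python
import math

def areal_contrast(side_px, contrast):
    """Area x contrast for a square element (area in square pixels)."""
    return side_px ** 2 * contrast

def balancing_contrast(large_side_px, small_side_px, large_contrast):
    """Contrast the small square needs so both squares have equal areal contrast."""
    return large_contrast * (large_side_px / small_side_px) ** 2

# Hypothetical contrast of 0.1 for the large (16-pixel) square.
small_contrast = balancing_contrast(16, 8, 0.1)
```

With a 4:1 area ratio the small square needs four times the contrast of the large square, which is where the minimum in perceived segregation tends to fall.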

2.2.1 How spatial-frequency channels might explain the area x
contrast trade-off

This trade-off between contrast and area suggested that
perceived texture segregation occurs strongly when a
spatial-frequency channel or channels yield a large difference in
activity to the striped and checked regions of the tripartite
pattern. The tripartite pattern is periodic, repeating itself every
two columns. The left panel in Figure 3 shows a pattern having
unequal-size squares and equal contrast; the middle panel shows the
output of a vertically oriented spatial-frequency channel tuned to
the fundamental frequency of the striped region; and the right
panel shows the output of a higher spatial-frequency channel.
Specifications of the channel properties are given in Section
2.2.2. When the excitatory region of a receptive field tuned to
the fundamental frequency of the pattern is centered over a column
of large squares, the receptive field is strongly stimulated by the
large squares and weakly inhibited by the small squares. When the
excitatory region is centered over a column of small squares, the
receptive field is weakly stimulated by the small squares and
strongly inhibited by the large squares. Thus, the middle panel
shows strongly modulated activity with high outputs at the center
of the large-square columns and low outputs at the center of the
small-square columns.

When the excitatory region of a vertically oriented receptive
field is centered over either a large or a small square in the
checked region of the pattern, the amounts of excitation and
inhibition are approximately equal. Thus, the output at each
spatial position is about the same, and there is little modulated
activity (see middle region of output in middle panel of Figure 3).
Channels tuned to higher harmonics of the pattern (right panel of
Figure 3) respond to the edges of the squares in the pattern.
Though there is a difference in the spatial arrangement of the
outputs in the striped and checked regions, there is no difference
in the amount of overall activity.

Figure 4 is like Figure 3 except that the contrast of the
small squares in the pattern is now 4 times that of the large
squares (so the areal contrasts are equal). Now when the
excitatory region of a receptive field from the channel tuned to
the fundamental frequency is centered over either a large or small
square in either the striped or checked regions, the output is
about the same, since the greater contrast has balanced out the
smaller size of the squares. The output of the higher-harmonic
channel (right panel) looks much like that in Figure 3.

We  conjectured  that  spontaneous  strong  texture  segregation 
occurs  only  when  there  are  differences  in  the  mean  or  modulated 
activity  of  a  channel  or  channels  to  the  striped  and  checked 
regions  of  the  tripartite  pattern. 

2.2.2  Simple  spatial-frequency  channels  model 

The challenge was to find a quantitative model that would
predict the observers' ratings of perceived segregation. To do so
requires, at the least, considering the responses of all the
filters. These responses are displayed in summary form in Figure
5 by taking a period from the middle of the checked region of any
one filter's output (as in Figure 3) and another period from the
middle of the striped region of the same filter's output and then
displaying this pair of periods for each of the different filters.
Figure 5 shows the outputs of 39 filters resulting from the
combination of thirteen different spatial frequencies (from left
to right in 13 columns in Figure 5) and three different
orientations: vertical, 45 degrees oblique, and horizontal (from
top to bottom in three pairs of two rows each).

Figure 5 shows responses to the unequal-size equal-contrast
pattern in Figure 1. (The large square is 4 times the area of the
small square and the contrasts of the squares are equal.) As
discussed in the previous sections, the vertically oriented
channels sensitive to spatial frequencies near the fundamental
frequency of the pattern (1.0 and 1.41 cycles/deg) show strongly
modulated activity in the striped region and little modulated
activity in the checked region. Strong modulated activity occurs
in the checked region for channels sensitive to the fundamental
oblique frequency (greater by the square root of 2 than the
fundamental vertical frequency: 1.41 and 2.0 cycles/deg), and there
is little modulated activity in the striped region. For lower
spatial frequencies whose periods are larger than the fundamental
period, there is little response to either the checked or striped
regions by either the vertical or oblique channels. The receptive
fields are so large that they average over adjacent rows and
columns of squares. (Some slight mean output differences between
checks and stripes are visible, especially in the lower left of
Figure 5. These are responses by very large receptive fields to the
edges of the tripartite pattern. The differences arise because
striped patterns were always near the horizontal edges of the
filtered patterns.) Channels tuned to spatial frequencies whose
periods are smaller than the fundamental period respond near the
edges of all the squares in the pattern. Although the pattern of
the activity is distributed differently in the striped and checked
regions, the amounts of modulated activity to the striped and
checked regions by the vertical and diagonal channels are similar.

Figure  6  shows  the  filtered  outputs  to  the  pattern  in  Figure 
2  in  which  the  products  of  the  areas  x  contrasts  of  the  large  and 
small  squares  are  equal.  (The  large  square  is  4  times  the  area  of 
the  small  square;  the  small  square  is  4  times  the  contrast  of  the 
large  square.)  As  discussed  in  the  previous  subsection,  the 
vertical  1.0  and  1.41  cycles/deg  and  the  oblique  1.41  and  2.0 
cycles/deg  channels  show  no  modulated  response  to  either  the 
checked  or  striped  regions  of  this  pattern.  There  is  no 
information  in  the  amount  of  modulated  activity  for  segregating 
the  tripartite  pattern  into  distinct  regions. 

Characteristics of Channels. Each channel is assumed to be a linear,
translation-invariant filter. We modeled the receptive-field
weighting functions by two-dimensional Gabor functions (as used,
for example, by Daugman, 1985; Watson, 1983). The weighting
function in one direction is a Gaussian multiplied by a sinusoid
and in the perpendicular direction a Gaussian. The parameters of
the model we have implemented follow Watson (1983) in that the
spatial-frequency half-amplitude bandwidth is one octave and the
orientation half-amplitude bandwidth is 38 degrees. In Figures 5
and 6, the variation in sensitivity with spatial frequency is not
represented but is in the calculations of the model. Our model is
less complete than that of Watson in two respects: the fields have
even symmetry, and we have not incorporated the decrease in acuity
occurring with retinal eccentricity.
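
A weighting function of the kind just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the report. The function names are hypothetical. The one-octave bandwidth is read as the full width between the half-amplitude points of a Gaussian transfer function centered on the preferred frequency. The 38-degree orientation bandwidth is read as a full width, giving half amplitude at plus or minus 19 degrees. Watson's (1983) exact parameterization may differ.

```python
import math

# Half-amplitude point of a Gaussian, in units of its standard deviation.
HALF_AMP = math.sqrt(2.0 * math.log(2.0))

def gabor_sigmas(f0_cpd, octave_bw=1.0, orient_bw_deg=38.0):
    """Space-domain s.d.s (deg) of the Gaussian envelope (assumed conversion)."""
    ratio = 2.0 ** octave_bw                              # f_high / f_low
    half_width = f0_cpd * (ratio - 1.0) / (ratio + 1.0)   # f0 / 3 for one octave
    sigma_u = half_width / HALF_AMP                       # frequency s.d. along the modulation axis
    sigma_v = f0_cpd * math.tan(math.radians(orient_bw_deg / 2.0)) / HALF_AMP
    # A Gaussian of s.d. sigma in space has s.d. 1 / (2 * pi * sigma) in frequency.
    return 1.0 / (2.0 * math.pi * sigma_u), 1.0 / (2.0 * math.pi * sigma_v)

def gabor_weight(x_deg, y_deg, f0_cpd, theta_deg=0.0):
    """Even-symmetric Gabor weight at (x, y); theta = 0 gives a vertically oriented field."""
    sigma_x, sigma_y = gabor_sigmas(f0_cpd)
    t = math.radians(theta_deg)
    xr = x_deg * math.cos(t) + y_deg * math.sin(t)    # axis carrying the sinusoid
    yr = -x_deg * math.sin(t) + y_deg * math.cos(t)   # perpendicular, Gaussian-only axis
    envelope = math.exp(-xr ** 2 / (2 * sigma_x ** 2) - yr ** 2 / (2 * sigma_y ** 2))
    return envelope * math.cos(2.0 * math.pi * f0_cpd * xr)
```

Using a cosine carrier makes every field even-symmetric, as in the model described above. A channel's output at a point would then be the sum of the image luminances weighted by `gabor_weight` centered on that point.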

Combining the Outputs of Many Channels to Predict the Observer's Rating. We now
wanted to turn these channel outputs into a quantitative prediction
of the observer's ratings of perceived texture segregation. Our
first attempt is a crude measure of the degree to which there are
gross differences between the outputs of one or more filters to the
two different texture regions. This simple model can be seen as
an elaboration of Julesz's (1972) original statistical conjecture
and is rather in the spirit of recent attempts by a number of
investigators (e.g., Caelli, 1982; Daugman, 1987; Turner, 1986) to
use spatial-frequency filters to predict results from a number of
early texture patterns. It differs from this earlier theoretical
work in attempting to generate quantitative predictions of the
strength of perceived texture segregation (rather than just
distinguishing textures that would segregate from those that
wouldn't at all) so that these quantitative predictions can be
compared to empirical results in a rigorous way.

(i) Our simple model first computes the output of channels that
are linear, translation-invariant filters tuned to many
different spatial frequencies and orientations as described
above. Let the output at position (x,y) of the channel tuned
to the $i$th frequency and the $j$th orientation be called
$O_{ij}(x,y)$.

(ii) The model then computes a spatially-pooled response of each
channel to the checked and to the striped region; in
particular, the standard deviation of the outputs at
different spatial positions is computed. For example, the
spatially pooled response of the $ij$th channel to the
checkerboard region is

$$\sigma_{ij}^{\mathrm{checked}} = \left[ \frac{1}{N_x N_y} \sum_{(x,y)\,\in\,\mathrm{checked\ region}} \left( O_{ij}(x,y) - \bar{O}_{ij}^{\mathrm{checked}} \right)^{2} \right]^{1/2}$$

where $\bar{O}_{ij}^{\mathrm{checked}}$ is the mean output over the
same positions, $N_x$ and $N_y$ are the numbers of spatial positions
in the x and y directions in one period of the pattern, and the
summing is done over the checked region. The model then takes the
difference between each filter's spatially-pooled response
to the checked and to the striped region, yielding a
within-filter difference for the $ij$th filter of

$$D_{ij} = \left| \sigma_{ij}^{\mathrm{checked}} - \sigma_{ij}^{\mathrm{striped}} \right|$$

(iii) The model combines (pools) these within-filter differences
across many spatial frequencies and orientations of filters,
weighting the differences according to the observers'
sensitivity to different orientations and spatial frequencies.
So, the predicted value equals a pooled weighted sum of
within-filter differences. In particular, the predicted
value equals the following quantity:

$$\left[ \sum_{i=1}^{N_f} \sum_{j=1}^{N_o} \left( C(f_i, \theta_j)\, D_{ij} \right)^{2} \right]^{1/2}$$

where $C(f_i, \theta_j)$ is the sensitivity of the observer to the
$i$th frequency and $j$th orientation, $N_f$ is the number of
frequencies (13, from .5 to 16 cycles/deg in square-root-of-two
steps) and $N_o$ is the number of orientations (3: vertical,
45 degrees, and horizontal).

(iv) Finally, the model assumes that the observer's ratings of
perceived segregation are monotonically related to the
predicted value shown above.

Note. We tried exponents of 1, 3, 4 in the above equations (both
for spatial-pooling and for pooling across channels) as well as
using the maximum, the minimum, or the maximum-minus-minimum.
Conclusions using any of them were identical to those illustrated
here with an exponent of 2. It should also be noted that the model
we are investigating specifies the information for texture
segregation but does not provide a procedure for locating texture
boundaries since it does not address the question of how the
information is used to segment a pattern into regions. This is a
problem discussed by Caelli (1985), Grossberg & Mingolla (1985),
and Grossberg (1987).
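
Steps (ii) and (iii) of the simple model can be sketched in a few lines. This is a hedged illustration rather than the code used for the report. The function names, the dictionary layout keyed by (frequency, orientation), and the toy numbers are hypothetical. The sketch assumes a population standard deviation for spatial pooling, an absolute within-filter difference, and an exponent-2 combination across channels in which the sensitivity weight multiplies the difference before squaring.

```python
import math
from statistics import pstdev

def pooled_response(outputs):
    """Step (ii): standard deviation of one channel's outputs over one region."""
    return pstdev(outputs)

def predicted_value(checked, striped, sensitivity, exponent=2.0):
    """Steps (ii)-(iii): pool weighted within-filter differences across channels."""
    total = 0.0
    for channel, weight in sensitivity.items():
        # Within-filter difference between the checked and striped regions.
        diff = abs(pooled_response(checked[channel]) - pooled_response(striped[channel]))
        total += (weight * diff) ** exponent
    return total ** (1.0 / exponent)

# Toy example: two channels, four sample positions per region (hypothetical values).
checked = {(1.0, "vertical"): [0.0, 0.0, 0.0, 0.0],
           (1.41, "oblique"): [2.0, -2.0, 2.0, -2.0]}
striped = {(1.0, "vertical"): [1.0, -1.0, 1.0, -1.0],
           (1.41, "oblique"): [0.0, 0.0, 0.0, 0.0]}
sensitivity = {(1.0, "vertical"): 1.0, (1.41, "oblique"): 0.5}

value = predicted_value(checked, striped, sensitivity)
```

In the toy example the vertical channel is modulated only by the stripes and the oblique channel only by the checks, so both contribute to the predicted value. Setting all outputs equal in both regions would give a predicted value of zero, which corresponds to no predicted segregation.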


2.3 Experiments compared to predictions of simple model

Methods and Procedures. The procedure and instructions were
similar to Beck, Sutter, & Ivry (1987). Except where noted the
background luminance was set at 16.1 ft.-L and appeared gray. The
intensity of one kind of texture element (e.g., the large squares
in Figures 1 or 2) was kept constant while the intensity of the
second kind of texture element (e.g., the small squares in Figures
1 or 2) was varied. When the texture elements differed in size,
the constant-luminance element was always the larger element and
the variable-luminance element the smaller element. Except where
noted the fundamental frequency of the patterns was about 1 c/deg
(the period was 56 pixels and at the viewing distance of 6 feet,
1 pixel subtends 1.08 minutes). In each experiment, 10 subjects
rated the perceived texture segregation of each pattern 5 times on
a 5-point rating scale. Each pattern was shown for 1 second. In
between patterns the screen was at the background luminance. All
the patterns in a given experiment were randomly intermixed.
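
The stated fundamental frequency can be checked from the pixel figures in the paragraph above. The short calculation below uses only the period of 56 pixels and the 1.08 minutes of arc per pixel; the variable names are illustrative.

```python
ARCMIN_PER_PIXEL = 1.08   # from the text, at the 6-foot viewing distance
PERIOD_PIXELS = 56        # from the text

period_deg = PERIOD_PIXELS * ARCMIN_PER_PIXEL / 60.0   # minutes of arc -> degrees
fundamental_cpd = 1.0 / period_deg                      # cycles per degree

print(f"period = {period_deg:.3f} deg, fundamental = {fundamental_cpd:.3f} c/deg")
# prints: period = 1.008 deg, fundamental = 0.992 c/deg
```

The period of just over one degree is consistent with the stated fundamental of about 1 c/deg.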

The contrast of an element is defined here to be the luminance
of the element minus the luminance of the background, divided by the
luminance of the background. (This is the quantity that Shapley
& Enroth-Cugell, 1985, called Weber contrast.) The horizontal axes
in all the figures show the contrast ratio of the
variable-intensity to fixed-intensity elements.
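
The contrast definition and the contrast ratio plotted on the horizontal axes can be written out as a minimal sketch. The function names are hypothetical. The background (32.3 ft.-L) and fixed-square (28.1 ft.-L) luminances are the white-background "close" values given in Section 2.3.1; the variable-square luminance of 19.7 ft.-L is an arbitrary illustrative value within the reported range.

```python
def weber_contrast(element_luminance, background_luminance):
    """(element - background) / background; negative for elements darker than the background."""
    return (element_luminance - background_luminance) / background_luminance

def contrast_ratio(variable_luminance, fixed_luminance, background_luminance):
    """Ratio of the variable-intensity element's contrast to the fixed-intensity element's."""
    return (weber_contrast(variable_luminance, background_luminance)
            / weber_contrast(fixed_luminance, background_luminance))

background = 32.3      # ft.-L, white background (Section 2.3.1)
fixed_square = 28.1    # ft.-L, fixed-intensity square, "close" condition
variable_square = 19.7 # ft.-L, illustrative value only

ratio = contrast_ratio(variable_square, fixed_square, background)
```

Because both squares are darker than the white background, both contrasts are negative and their ratio is positive; here the variable square has about three times the contrast of the fixed square.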

2.3.1  Area-ratios  and  background  luminances 

Three experiments investigated texture segregation as a
function of contrast and background luminance. In one experiment
(Sutter, 1987), the textures were composed of either two 16-pixel
squares or of one 16- and one 8-pixel square. Thus, the element
area-ratios in the first experiment were 1:1 and 4:1. The squares
were presented on a white (32.3 ft.-L) and on a black (.05 ft.-L)
background. The luminance of the fixed-intensity squares was set at
values close to the background (28.1 ft.-L for the white
background and .26 ft.-L for the black background) or far from the
background (24.06 ft.-L for the white background and 2.44 ft.-L
for the black background). The luminances of the
variable-intensity squares were always equal to or below the white
background (.2 to 28.1 ft.-L for the close condition and .2 to
24.06 ft.-L for the far condition) and always equal to or above
the black background (.62 to 6.72 ft.-L for the close condition
and 2.44 to 32.2 ft.-L for the far condition). Figure 7 presents
the results with a white background and Figure 8 with a black
background. The results are consistent with the earlier findings
of Beck, Sutter, and Ivry (1987) and are in accord with the
predictions of the simple model. For the textures composed of the
16- and 8-pixel squares, segregation was a U-shaped function, with
a minimum around the point at which the squares were of equal areal
contrast. For the textures composed of equal size squares (both
squares 16 pixels), segregation increased steadily as the luminance
of one square increased. The increase was slower on a black
background than on a white background. A possible reason for this difference
is  that  because  of  early  response  compression  the  effective 
contrast  of  the  sc[uares  on  a  black  background  reaches  a  maximum 
which  does  not  increase  with  further  increases  in  luminance. 

Two further experiments (Sutter, Beck, & Graham, 1988)
investigated the interaction of contrast and size in texture
segregation by using four different element area-ratios: 1:1 (both
squares were of the same size, 16 pixels or approximately 16
minutes on a side), 1.78:1 (16 x 16 pixels and 16 x 12 pixels), 4:1
(16 x 16 and 8 x 8) and 16:1 (16 x 16 and 4 x 4). In the first
experiment (results shown in top panel of Figure 9), the patterns
were presented on a black background of .05 ft.-L and the luminance
of the fixed-intensity squares was set at .26 ft.-L. The
variable-intensity squares were always above the background and
ranged from .26 ft.-L to 32.2 ft.-L. In the second experiment, the
patterns were presented on a gray background of 16.1 ft.-L, and the
luminance of the fixed-intensity squares was held constant at 2
ft.-L above or below the background (18.1 or 14.1 ft.-L,
respectively). The variable-intensity squares were presented
above (middle panel of Figure 9) and below (bottom panel of Figure
9) the background. The luminance of the variable squares was set
at 1 of 7 values ranging from .03 to 14.1 ft.-L for the squares
below the background and from 18.1 to 32.2 ft.-L for the squares
above  the  background.  The  element  contrast  ratios  varied  in  steps 
of  2‘^ 


Figure 9 shows that perceived segregation is, in general, a
U-shaped function with the minimum depending on the relative size
of the square and shifting to larger contrast ratios as the element
area-ratio increased. For an area-ratio of 1, the minimum
perceived segregation occurred when the contrast ratio was 1. For
an area-ratio of 1.78:1, the minimum occurred between 1 and 2. For
4:1, the minimum occurred between 2 and 4, and for 16:1, the
minimum occurred at approximately 20:1. This pattern of results
is consistent with the simple model. (The minimum is not predicted
to occur precisely at the point of equal areal contrast for these
patterns because the large squares were larger than the space
between them.)
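
The area-by-contrast tradeoff can be illustrated with a short Python sketch. It treats areal contrast as contrast times element area (the definition given in Section 2.3.2) and computes the contrast ratio at which a small element matches the areal contrast of a large one. The function names are assumptions made for illustration, and this is only the equal-areal-contrast point, not the full model prediction, which the text notes can differ from it.

```python
def areal_contrast(contrast, width_px, height_px):
    """Areal contrast: element contrast multiplied by element area in pixels."""
    return contrast * width_px * height_px


def equal_areal_contrast_ratio(large_size, small_size):
    """Contrast ratio (small element / large element) at which the two
    elements have equal areal contrast; this equals the area ratio."""
    large_area = large_size[0] * large_size[1]
    small_area = small_size[0] * small_size[1]
    return large_area / small_area


# Element sizes (width, height) in pixels for three of the area-ratio conditions.
conditions = {
    "1:1": ((16, 16), (16, 16)),
    "4:1": ((16, 16), (8, 8)),
    "16:1": ((16, 16), (4, 4)),
}

for label, (large, small) in conditions.items():
    print(label, equal_areal_contrast_ratio(large, small))
```

Under this idealization the trough should sit near contrast ratios of 1, 4, and 16 for the three conditions, which is roughly where the observed minima fall.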

Note that the size difference of the squares also affected
texture segregation. For example, the perceived segregation at the
minimum in the top panel (black background) increased as the
element area-ratio increased, ranging from .20 for the 1:1
area-ratio to 1.85 for the 16:1 area-ratio. That is, the steepness
of the trough decreased as area-ratio increased. This dependence
is not consistent with the simple model, which predicts that
functions for all area-ratios should dip down to approximately the
same very low value. We will return to this dependence in our
discussion of the complex model (page 22).

Some minor differences among the three panels of Figure 9
may be other effects of response compression for luminances far
above the background luminance. For example, perceived segregation
increased for contrast ratios higher than that at the minimum
except for the 16:1 condition, where the luminance of the small
square was becoming extremely high by the time the perceived
segregation reached a minimum.

2.3.2 Lines vs. square elements

This experiment investigated whether there are differences
in perceived segregation as a function of whether the texture
elements are squares or lines (Sutter, Beck, & Graham, 1988). Four
patterns composed of squares and four composed of lines were
presented. In the textures composed of squares, the
fixed-intensity elements were always 16 x 16 pixel squares. The
variable-intensity elements were 16 x 16, 11 x 11, 8 x 8, or 4 x
4 pixel squares. (Each pixel is approximately 1 minute of visual
angle.) In the textures composed of lines, the fixed-intensity
elements were always lines of width 2 pixels and height 16 pixels.
The other elements were all lines of width 2 pixels; the heights
could be 16, 11, 8, or 4 pixels. The center-to-center separation
of the squares and lines was 28 pixels. The variable-intensity
elements were always above the background. The luminances of the
background and the fixed- and variable-intensity elements were the
same as in the gray background experiment in Section 2.3.1 (page
11).


Figure 10 shows the mean ratings of the textures composed of
squares (top) and lines (bottom). As in the earlier experiments,
and as predicted by the simple spatial frequency model, perceived
segregation depends on differences in areal contrast (contrast
times area) between the texture elements, with minimal segregation
occurring at the point where the areal contrasts of the large and
small elements are approximately the same. Patterns composed of
squares produced better segregation than patterns composed of
lines. This result was also found by Beck, Prazdny and Rosenfeld
(1983) and is predicted by the simple spatial-frequency model. It
follows from the fact that the lines occupied a smaller area of the
pattern than the squares and activation of the spatial frequency
channels was thus "diluted" by the background.

The interaction of element area and contrast supports the
argument that the segregation of patterns composed of different
arrangements of lines is not attributable to "emergent" features
of the elements. If segregation of the line patterns had depended
on the linking of the longer (16 pixel) lines into emergent, even
longer lines, segregation should have been an increasing function
of the contrast ratio between the elements, regardless of their
area-ratio, since greater differences in contrast would have
increased the linking of lines based on similarity of contrast.

For the textures composed of squares, the size difference of
the squares affected perceived segregation. Figure 10 (top) shows
that the minimum segregation ratings increased with the size
differences between the squares in the pattern. This result is not
explained by the simple model. For the textures composed of lines,
however, Figure 10 (bottom) shows that the size difference of the
lines did not yield different minimum segregation ratings. This
is in accord with the simple model. A reason for this difference
in results will be proposed when we present the complex model (page
23).

2.3.3 Pattern density (duty cycle)

Seventy patterns were constructed through the combination of
2 element area-ratios (1:1 and 4:1), 5 large-square-sizes (12, 10,
8, 6, and 4 pixel squares), and 7 contrast-ratios. Thus, for the 1:1
element area-ratio condition, the patterns were composed of two
sets of equal size squares of 12, 10, 8, 6, or 4 pixels on a side
(12.96, 10.80, 8.64, 6.48, and 4.32 min, respectively). In the 4:1
element area-ratio condition, the patterns were composed of 12
pixel and 6 pixel squares, 10 pixel and 5 pixel squares, 8 pixel
and 4 pixel squares, 6 pixel and 3 pixel squares, or 4 pixel and
2 pixel squares. The center-to-center element spacing was held
constant at 14 pixels (15.12 min), thus creating patterns that were
approximately 4 degrees in width and height. The
variable-intensity elements were always above the background. The
luminances of the background and the fixed- and variable-intensity
elements were the same as in the gray background experiment in
Section 2.3.1 (page 11).
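
The stimulus design above can be checked with a small Python sketch that enumerates the factorial combination and converts pixel sizes to visual angle at the stated 1.08 minutes per pixel. The contrast levels are represented by placeholder indices because their values are not listed in this paragraph, and the linear fill fraction (square side divided by spacing) is a hypothetical density measure introduced only for illustration.

```python
from itertools import product

ARCMIN_PER_PIXEL = 1.08  # 1 pixel subtends 1.08 minutes at the 6-foot viewing distance
SPACING_PX = 14          # center-to-center element spacing in pixels


def px_to_arcmin(pixels):
    """Convert a length in pixels to minutes of visual angle."""
    return pixels * ARCMIN_PER_PIXEL


area_ratios = ["1:1", "4:1"]
large_square_px = [12, 10, 8, 6, 4]
contrast_levels = range(7)  # placeholder indices for the 7 contrast ratios

# Full factorial combination of the three design factors.
patterns = list(product(area_ratios, large_square_px, contrast_levels))
print(len(patterns))

for side in large_square_px:
    fill = side / SPACING_PX  # hypothetical linear density measure
    print(side, round(px_to_arcmin(side), 2), round(fill, 2))
```

The fill fraction falls as the squares shrink while the spacing stays fixed, which is the sense in which density decreases across the five square sizes.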

Figure 11 (top panel) shows the mean segregation ratings for
the 1:1 element area-ratio condition, and Figure 11 (bottom panel)
shows the mean ratings for the 4:1 element area-ratio. As in
previous experiments, perceived segregation for the 4:1 element
area-ratio was a u-shaped function and minimal at or around the
point at which the areal contrasts of the large and small squares
were equal. For both the 1:1 and 4:1 element area-ratios,
perceived segregation decreased with decreasing density (decreasing
element size). This result is consistent with our simple model.
Decreasing element density dilutes the effect of contrast
differences between the elements because there is more background
space. This would result in greater uniformity in the outputs of
spatial-frequency channels corresponding to the fundamental period
of the striped and checked regions. This should, and in fact did,
produce poorer segregation. However, the basic tradeoff between
area and contrast remains. As in previous experiments, one aspect
of the results was not predicted by the simple model. Texture
segregation was also a function of the size difference between the
large and small squares. The u-shaped functions became shallower as
the size difference between the large and small squares increased.

2.3.4  Varying  the  fundamental  frequency  (scaling) 

Normally texture patterns scale in the sense that perceived
segregation remains constant with changes in the visual angle
subtended. Wertheimer (1923) observed that the grouping of a set
of elements does not change with viewing distance or the
magnification of the pattern. Green, Wolf, and White (1959) also
found that, as long as the relative element sizes and separations
remain constant, the absolute size of the pattern does not affect
texture discrimination. In contrast, Beck, Prazdny and Rosenfeld
(1983) found that patterns like Figure 1 fail to scale. When the
sizes and separations of the squares were reduced by one-half,
perceived segregation increased. This indicates that segregation
depends on the absolute sizes of the elements and their
separations.

This observation is consistent with the view that perceived
segregation is mediated by the outputs of spatial frequency
channels. Contrast sensitivity is generally highest at a spatial
frequency ranging from 3-10 cycles/deg, depending on experimental
conditions. The standard tripartite pattern (Figures 1 and 2) had
a period of 56 pixels (the distance between the centers of two
columns of the same type of square), which translates to
approximately 1 cycle/degree. Figures 5 and 6 show that the
spatial frequency channel that gives the best information for
segregation is one that matches the period of the pattern. For the
standard tripartite pattern, this channel would have peak output
at a spatial frequency of around 1 cycle/degree. By either
increasing the viewing distance or proportionately decreasing the
sizes of the elements and their separations, the period of the
pattern can be decreased, thus increasing the spatial frequency
which carries the most information about differences between the
striped and checked regions of the pattern. Reducing the period
of the pattern should increase perceived segregation, up to the
point where the fundamental frequency component of the pattern has
a spatial frequency at the peak of the contrast sensitivity
function. Further reduction of the period of the pattern should
lead to a decrease in perceived segregation because the fundamental
frequency component of the pattern will be of a spatial frequency
that is higher than that at the peak, thus entering a range where
contrast sensitivity decreases.

We tested this prediction of the spatial frequency model.
The period of the tripartite pattern was reduced by decreasing the
sizes and separation of the squares making up the pattern. The
effects of contrast differences between the two types of squares
were investigated under 3 element area-ratio (1:1, 4:1, and 16:1)
and 4 fundamental frequency conditions (1, 2, 4, and 8
cycles/degree).

Seventy-seven stimuli were constructed through the partial
combination of 3 element area-ratios (1:1, 4:1, and 16:1), 4
fundamental-frequencies (1, 2, 4, and 8 cycles/degree), and 7
contrast-ratios. The 3 element area-ratio conditions were
equal-size squares (1:1 element area-ratio), unequal-size squares
with a ratio of areas of 4:1, and unequal-size squares with a ratio
of areas of 16:1. The 4 fundamental-frequencies were 1, 2, 4, and
8 cycles/degree, which corresponded to center-to-center element
separations (one-half periods) of 28 pixels (30.24 min), 14 pixels
(15.12 min), 7 pixels (7.56 min), and 4 pixels (4.32 min),
respectively. The center-to-center element separation was held
constant at 1.75 times the width of the largest square in the
pattern. The 4 largest square sizes were 16, 8, 4, and 2 pixels on
a side (17.28, 8.64, 4.32, and 2.16 min, respectively). The effect
of reducing the element size and separation was to decrease the
size of the whole pattern, as well as its period. The patterns
with a fundamental frequency of 1 cycle/degree measured 7.56
degrees in height and width. The patterns with
fundamental-frequencies of 2, 4, and 8 cycles/deg measured 3.78,
1.89, and 1.08 degrees, respectively, in height and width. The
variable-intensity elements were always above the background. The
luminances of the background and the fixed- and variable-intensity
elements were the same as in the gray background experiment in
Section 2.3.1 (page 11).
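
The relation between element separation and fundamental frequency can be sketched in Python as follows. The calculation assumes that the pattern period is twice the center-to-center separation (the separations are described as one-half periods) and uses the stated 1.08 minutes of visual angle per pixel; the function name is an illustrative assumption.

```python
ARCMIN_PER_PIXEL = 1.08   # visual angle of one pixel at the viewing distance
ARCMIN_PER_DEGREE = 60.0


def fundamental_cpd(separation_px):
    """Fundamental frequency in cycles/degree for a given center-to-center
    element separation, taking the period as twice the separation."""
    period_arcmin = 2 * separation_px * ARCMIN_PER_PIXEL
    return ARCMIN_PER_DEGREE / period_arcmin


# (nominal frequency in cycles/degree, separation in pixels) from the text
for nominal, separation in [(1, 28), (2, 14), (4, 7), (8, 4)]:
    print(nominal, separation, round(fundamental_cpd(separation), 2))
```

Under these assumptions the first three conditions come out very close to their nominal 1, 2, and 4 cycles/degree, while the 4-pixel separation gives a value somewhat below the nominal 8 cycles/degree, presumably because 1.75 times a 2-pixel square (3.5 pixels) had to be rounded to a whole number of pixels.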

The results are presented in Figure 12 for the 1:1, 4:1,
and 16:1 element area-ratios. As in the previous experiments,
perceived segregation is a u-shaped function of contrast ratio,
with a minimum around the point at which the areal contrasts of
the two texture elements are equal. Figure 13 shows the predicted
segregation values produced by the simple spatial frequency model.
A comparison of the predicted and observed segregation curves shows
that the general shapes of the observed segregation curves for each
element area-ratio are fairly well predicted by the model, as are
the contrast ratios at the points of minimum segregation for the
three element area-ratios.

Figure 12 shows that perceived segregation of the patterns
varies with their period. Some aspects of this variation in
perceived segregation are predicted by the simple spatial frequency
model (Figure 13). According to the model, the pattern which
should be most easily segregated, at a given contrast ratio, is
that whose fundamental spatial frequency component is approximately
4 cycles/degree. Patterns with higher or lower fundamental
frequency components should be more difficult to segregate. This
prediction is consistent with many of the observed segregation
ratings, with a few striking exceptions in the 4:1 and 16:1 element
area-ratio conditions. The effects of changing the fundamental
frequency were also found with patterns of constant size. In that
case, instead of reducing the overall size of the patterns, the
overall size was maintained by increasing the number of rows and
columns of elements (Sutter, 1987).

The contrast-ratio at which minimum observed segregation occurs
in these conditions is well predicted by the model, but notice that
the observed segregation curves cross over dramatically, while the
predicted curves just move up or down, for the most part, depending
on fundamental frequency. For very low contrast ratios, the
ordering of curves is roughly from the most visible on top to the
least visible fundamental frequencies (as was built into the model
by weighting the predictions by a particular sensitivity function).
In the trough, however, the ordering is different; patterns having
low fundamental frequencies still segregate quite well
(contradicting the model) while patterns with high fundamental
frequencies don't segregate at all (agreeing with the model).

This discrepancy between the simple model's predictions and
the results of varying fundamental frequency is probably closely
related to another discrepancy mentioned near the beginning of this
section: namely, for patterns with a fundamental frequency of about
1 c/deg, the dip in the u-shaped function becomes less pronounced
as the area-ratio becomes higher or, equivalently, as the
difference between the sizes of the squares becomes larger. The
relation between the two discrepancies will become clearer below
when we propose the complex model (page 24). Roughly, the idea is
this: for low fundamental spatial frequencies of textures, the
information at the higher harmonics (transmitted by channels
sensitive to the higher harmonics, e.g., the right hand side in
Figures 5 and 6) is available to the observer and does help to
segregate the patterns (although this kind of information is
totally ignored by the simple model). At high fundamental spatial
frequencies, however, the higher harmonics of the texture patterns
are invisible to the observer and so cannot contribute.

2.3.5 Squares, circles, blobs, and aligned vs. non-aligned squares

Perceived segregation of the tripartite pattern depends on
the detection of the difference in the arrangement of elements in
the striped and checked regions of the patterns. The channels
showing strikingly different outputs to the different arrangement
of elements in the striped and checked regions are at the
fundamental period of the pattern. The outputs from the higher
spatial-frequency channels have very little effect on perceived
segregation. Although the pattern of activity is distributed
differently in the striped and checked regions, the amounts of
modulated activity in the striped and checked regions are similar
at the higher spatial-frequencies. Channels tuned to the higher
spatial-frequencies respond to the edges of the elements in a
pattern. Therefore, altering the contours of the elements should
only have a minor effect on the perceived segregation.

An alternative explanation proposed by Grossberg and Mingolla
(1985) explains perceived segregation of the tripartite pattern as
the result of complex interactions within a Boundary Contour (BC)
System. According to Grossberg and Mingolla, the boundaries
generated by the BC system need not be visible. In the case of
tripartite patterns, Grossberg and Mingolla argue that the BC
system creates invisible, elongated, vertical boundaries in the top
and bottom regions and invisible diagonal boundaries in the middle
region of the tripartite pattern. These invisible boundaries are
the basis for the perceived segregation of the tripartite pattern.
Perceived segregation should, therefore, be significantly affected
by changing the contours of the elements. The invisible boundaries
formed in the BC system would be expected to be formed more
strongly when the elements of the patterns are aligned than when
they are not aligned.

Two experiments investigated the effect of element contour
alignment on perceived segregation. In the first experiment, the
elements were squares, circles, and blobs. Two area-ratios were
investigated: 1:1 and 4:1. In the patterns composed of squares,
the fixed-intensity square was 16 pixels on a side. The
variable-intensity squares were either 16 or 8 pixels on a side.
The circle and blob stimuli were equated in area to the squares.
Figure 14 shows the 1:1 square, circle, and blob patterns. The
center-to-center separation of the elements was 28 pixels. The
variable-intensity elements were always above the background. The
luminances of the background and the fixed- and variable-intensity
elements were the same as in the gray background experiment in
Section 2.3.1 (page 11). (The patterns are shown on a dark gray
background to facilitate copying.)

Figure 15 presents the mean segregation ratings for the 1:1
area-ratio (top panel) and the 4:1 area-ratio (bottom panel).
Edge alignment failed to affect perceived segregation. The curves
for the square, circle and blob elements are highly similar. The
area x contrast tradeoff indicates that it is differences in the
outputs of spatial-frequency channels approximating the fundamental
period of the pattern, and not differences in the higher
spatial-frequencies, that are important for perceived segregation.

In the second experiment, the perceived segregation of aligned
and non-aligned squares was compared. The textures were composed
of squares 16 pixels on a side. Figure 16 shows examples of the
aligned (top panel) and the non-aligned arrangements (bottom
panel). The patterns were composed of one-element-only,
opposite-sign-of-contrast, and same-sign-of-contrast elements. The
elements composing the patterns, the luminance conditions, and the
form in which the data are plotted are explained in Section 2.5
(page 21). Figures 17 and 18 present the results. No significant
differences in perceived segregation occurred as a function of the
alignment of the squares. If segregation occurred due to the
contour interactions in the BC system, as suggested by Grossberg
and Mingolla, perceived segregation should have been significantly
reduced in the pattern composed of the non-aligned squares.

2.4  Patterns  with  no  energy  at  the  fundamental:  results  and 
predictions  of  simple  model 

In this experiment, the patterns contained elements that were
either uniform squares (14 x 14 pixels) or were center-surround
elements composed of a center square (10 by 10 pixels) surrounded
by a square frame 2 pixels wide so that the total dimensions of the
element were 14 by 14. The center-to-center separation of the
texture elements was 26 pixels. The contrasts of the centers and
annuli of the center-surround elements were always equal but of
opposite sign so that the average luminance of the center-surround
element was the same as the background.
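
The balance between center and annulus can be checked with a short Python sketch based only on the pixel dimensions stated above. It assumes an idealized element in which every pixel carries exactly the nominal contrast; the function name and default arguments are illustrative assumptions rather than part of the original procedure.

```python
def mean_element_contrast(center_contrast, outer_px=14, frame_px=2):
    """Area-weighted mean contrast of a center-surround element whose
    frame has the same contrast magnitude as the center but opposite sign."""
    center_px = outer_px - 2 * frame_px         # side of the center square
    center_area = center_px ** 2                # pixels in the center
    frame_area = outer_px ** 2 - center_area    # pixels in the surrounding frame
    surround_contrast = -center_contrast
    total = center_contrast * center_area + surround_contrast * frame_area
    return total / outer_px ** 2


print(mean_element_contrast(1.0))
print(mean_element_contrast(-0.5))
```

With the stated dimensions the center covers 100 pixels and the frame 96, so in this idealization the two nearly cancel, leaving a residual of about 2 percent of the center contrast. This is in line with the statement that the average luminance of the element matched the background, and it is why little stimulus energy remains at the fundamental frequency.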

Some patterns, called "opposite-sign-of-contrast patterns"
below, were constructed so that the centers of the two types of
texture elements were of opposite sign-of-contrast. In the case
of center-surround elements (Figure 19 top) one element consisted
of a higher intensity center and a lower intensity annulus; the
other element of a lower intensity center and a higher intensity
annulus. In the case of solid squares, one element consisted of
squares brighter than the background and one of squares darker than
the background. Other patterns, called "one-element-only patterns"
below, contained only one kind of element (as shown in Figure 19
bottom). In this case that type could either have the center
(which in the case of uniform squares implies the whole square)
brighter than the background or dimmer than the background. We
also used various unbalanced combinations which will not be
discussed further although they might prove useful in testing our
complex model. The background was set at 16.1 ft.-L.

Some of the results are shown in Figure 20. The horizontal
axis for each panel shows the difference in luminance between the
center of an element (which in the case of the square was the whole
square) and the background. For opposite-sign-of-contrast patterns
using center-surround elements, segregation was poor, regardless
of the level of contrast (Figure 20 upper left). When the elements
were uniform squares of opposite sign of contrast (Figure 20 upper
right), on the other hand, segregation was good (as previously
reported by Beck, Sutter, & Ivry, 1987). For one-element-only
patterns, perceived segregation was very good both for
center-surround elements (Figure 20 lower left) and for solid
squares (Figure 20 lower right).

These results are not in accord with the predictions of the
simple model in several ways. The most dramatic is that, since
the center-surround elements average out to the background
luminance, no stimulus information is available at the fundamental
frequency of the pattern. Thus, according to the simple model, no
patterns based on center-surround elements should segregate well.
However, the one-element-only patterns do. This shows that
perceived segregation can occur based on information contained only
in the higher harmonics of the texture pattern.

2.5 Patterns with same- and opposite-sign-of-contrast: results
and predictions of simple model

An intriguing result that does not fit the framework of
low-level channels without separate consideration of "on" and "off"
responses is that patterns with squares of
opposite-sign-of-contrast yield such good texture segregation
compared to same-sign-of-contrast patterns (Beck, Sutter, & Ivry,
1987). One possibility is that sign of contrast is a feature, i.e.,
positive and negative contrasts are encoded in different feature
maps. Perceived segregation of opposite-sign-of-contrast patterns
should then be similar to one-element-only patterns. An experiment
investigated whether the strong texture segregation that occurs
when the squares are of opposite contrast-sign involves different
mechanisms from those when the squares are of the same
sign-of-contrast.

One-Element-Only, Opposite-Sign-of-Contrast, and Same-Sign-of-Contrast Solid
Elements. Figure 21 shows portions from the luminance profiles of
each of five different patterns. Each profile shows one of each
of the two kinds of elements composing the texture. In all of
these patterns both kinds of elements are squares of the same size.
Also the background luminance is the same in all cases (16.1
ft.-L). The element luminance changes from pattern to pattern but
with the following important constraint: the difference between
the luminances of the two squares is the same in all cases. Hence
we  will  call  the  stimuli  in  Figure  21  a  "constant -Li-minimum-Lj"  or 
"constant-difference"  series  of  stimuli. 
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In  this  experiment,  we  used  66  stimuli  of  the  five  types 
shown  in  Figure  21.  The  luminance  difference  between  the  squares 
could  be  0.0,  .75,  1.50,  2.25,  3.00,  and  3.75  ft.-L.  Figure  22 
plots  the  mean  ratings  (except  stimuli  for  which  the  difference 
was  zero,  all  of  which  led  to  very  low  ratings  between  0  and  .16). 
The  vertical  axis  shows  the  mean  segregation  ratings.  Each  curve 
connects  the  results  for  a  constant-difference  series  of  stimuli. 
The  horizontal  axis  is  simply  a  convenient  monotonic  transformation 
of  contrast  ratio  into  a  quantity  we  called  the 
contrast-ratio angle. (The contrast-ratio angle is equal to 135
degrees minus the arctangent of the contrast ratio.) A contrast-ratio
angle of zero represents squares of opposite contrast-sign with the
luminances of the two squares equally above and below the
background. Points plotted between 0 and +45 degrees represent
stimuli of opposite sign-of-contrast in which the contrast of the
square above the background is increasingly greater than the
contrast of the square below the background. At +45 degrees the
luminance of the square below the background is equal to the
background, and the pattern is composed of a single square above the
background. Points plotted between 45 and 90 degrees represent
stimuli in which the luminances of both squares are above the
background. As one moves towards 90 degrees the ratio of the
contrasts of the two squares approaches 1.0. At 90 degrees the
contrasts of the two squares are equal.
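
As a rough illustration of this horizontal axis, the sketch below computes a contrast-ratio angle from two square luminances and a background luminance. The report gives only the verbal rule (135 degrees minus the arctangent of the contrast ratio). The use of a two-argument arctangent to choose the branch, the wrap into (-180, 180], and the function and parameter names are assumptions made here so that the landmark values described above (0, +45, and 90 degrees) come out as stated.

```python
import math

def contrast_ratio_angle(l_high, l_low, l_background):
    """Contrast-ratio angle in degrees (hypothetical helper).

    l_high and l_low are the luminances of the brighter and dimmer
    squares. Dividing each difference by the background would give
    Weber contrasts, but the common divisor cancels in the ratio.
    """
    c_high = l_high - l_background
    c_low = l_low - l_background
    # Assumption: atan2 selects the branch of arctan(c_high / c_low).
    theta = math.degrees(math.atan2(c_high, c_low))
    angle = 135.0 - theta
    # Assumption: report the angle in the interval (-180, 180].
    if angle > 180.0:
        angle -= 360.0
    return angle

background = 16.1  # ft.-L., the background used in this experiment
# Opposite sign, equal magnitude; one square only; two equal squares.
print(contrast_ratio_angle(background + 1.5, background - 1.5, background))
print(contrast_ratio_angle(background + 3.0, background, background))
print(contrast_ratio_angle(background + 3.0, background + 3.0, background))
```

The three calls print values of approximately 0, 45, and 90 degrees. Under the same assumptions, same-sign patterns with both squares below the background map to negative angles.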

Figure  22  shows  that  textures  consisting  of  a  single  square 
segregated  most  strongly,  followed  by  textures  composed  of  elements 
of  opposite-sign-of-contrast  (thereby  replicating  the  results  in 
the right panels of Figure 20). Textures composed of elements of
the  same-sign-of-contrast  segregated  least.  Figure  22  shows  that 
the  textures  with  a  single  square,  squares  of 
opposite-sign-of-contrast,  and  the  squares  of  the 
same-sign-of-contrast  all  lie  on  a  continuum.  A  small  change  in 
the  stimulus  produces  a  small  change  in  perceived  texture 
segregation.  The  continuity  of  the  functions  is  consistent  with 
the  hypothesis  that  texture  segregation  is  a  function  of  the 
outputs  of  spatial-frequency  channels  rather  than  separate  on  and 
off mechanisms. Textures composed of single types of elements and
of elements of opposite contrast-sign are not categorically
different from textures composed of elements with the same
contrast-sign. However, an examination of the output of the
spatial-frequency channels shows that they cannot explain the data
completely. Figure 23 (in the same form as Figure 22) shows the
predictions of the simple model. As can be seen, all stimuli in
a constant-difference series are predicted to segregate to
approximately the same extent. This occurs because the outputs of
the most active channels (those tuned to frequencies near the
fundamental) are the same for all stimuli in such a series. For
these channels, the background falls equally on the inhibitory and
excitatory portions of the receptive field so it is only the
squares themselves that count. When the difference between the
luminances of the squares is held constant, the modulated activity
in the output of the channel is constant. The outputs of other
channels are different for different members of the series, but
these channels contribute very little to the predictions of the
simple model. Thus, the greater perceived segregation of the
one-element-only patterns than of the opposite-sign-of-contrast
patterns, and of the opposite-sign-of-contrast patterns than of the
same-sign-of-contrast patterns, is not consistent with the simple
model.
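
The constancy argument can be checked numerically in a simplified one-dimensional setting. The sketch below is not the authors' filter model. It assumes a periodic luminance profile in which the two kinds of squares sit half a period apart on a uniform background, and it takes the magnitude of the Fourier component at the fundamental as a stand-in for the output of a channel tuned to the fundamental. The period, element width, and function name are hypothetical.

```python
import cmath

def fundamental_amplitude(l_first, l_second, l_background,
                          period=64, width=16):
    """Magnitude of the fundamental Fourier component of one period.

    Assumed 1-D layout: a square of luminance l_first starts at
    sample 0, a square of luminance l_second starts half a period
    later, and every other sample is at the background luminance.
    """
    profile = [l_background] * period
    for i in range(width):
        profile[i] = l_first
        profile[period // 2 + i] = l_second
    coefficient = sum(
        value * cmath.exp(-2j * cmath.pi * n / period)
        for n, value in enumerate(profile)
    )
    return abs(coefficient) / period

background = 16.1  # ft.-L.
# Each pair differs by 3.0 ft.-L.: opposite sign, one element only, same sign.
series = [
    (background + 1.5, background - 1.5),
    (background + 3.0, background),
    (background + 6.0, background + 3.0),
]
for l_first, l_second in series:
    print(round(fundamental_amplitude(l_first, l_second, background), 6))
```

The three printed values agree: the uniform background contributes nothing at the fundamental, and the two squares, being half a period apart, enter with opposite sign, so only their luminance difference remains.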


Another  discrepancy  between  predictions  and  results  occurs 
for same-sign-of-contrast patterns (toward the left and right
edges of Figures 23 and 22). Two patterns having the same
contrast-ratio (L2-L0)/(L1-L0) but different luminance differences
L2-L1 are not predicted to segregate to the same extent; that is,
the curves in Figure 23 are vertically displaced at the left and
right edges as well as in the middle. But two such patterns did
tend to segregate to the same extent in the experimental results;
that is, the different curves in Figure 22 converge at the left
and right edges of the figure so that for large luminance
differences, the size of the luminance difference (which curve the
pattern's  point  is  on)  ceases  to  affect  perceived  segregation;  only 
contrast-ratio  (the  point's  horizontal  coordinate)  does. 

The  two  discrepancies  just  mentioned  suggest  two  different 
modifications  of  the  simple  model  may  be  necessary.  As  described 
earlier,  the  one-element-only  advantage  suggests  using  the 
information  in  the  higher-harmonic  channels.  The  second 
discrepancy  suggests  instead  a  modification  of  the  assumption  that 
each  filter's  output  (and  hence  the  simple-model's  predicted  value) 
is  linear  with  luminance.  Standard  results  on  "light  adaptation" 
make it clear, of course, that such modification is absolutely
necessary  in  any  complete  model  of  the  visual  system.  It  is  only 
in  the  case  of  patterns  in  which  the  range  of  luminances  and 
contrasts  is  small  enough  that  one  can  hope  for  linearity. 

2.6 Complex spatial-frequency channels model

Two aspects of the results reported in Section 2.3 are
different from those predicted by a simple spatial-frequency
channels model, i.e., a linear pooling of within-filter differences
weighted by the contrast sensitivity function (page 9). First, the
minima in the functions in Figures 9, 10 (top panel),
11 (bottom panel), and 12 (right panel) are not the same for all
area-ratios but vary with the size difference between the large
and small squares making up a pattern. The dip at the minimum is
shallower as the size difference between the large and small squares
is larger. If areal contrast were the only factor affecting
segregation, the segregation ratings should have been the same when
the area x contrast of the large and small squares composing a
pattern were equated. Second, the curves in Figure 12 (right
panel) cross. Figure 13 shows that the functions predicted by the
simple model do not cross.

Further  experiments  with  textures  containing  balanced 
elements  with  no  energy  at  the  fundamental  frequency  (Section  2.4) 
and  with  textures  containing  elements  of  opposite-sign-of-contrast 
(Section  2.5)  also  gave  results  which  are  not  consistent  with  the 
simple  model  we  proposed.  These  discrepancies  suggest  that  the 
simple  model  does  not  make  sufficient  use  of  information  in  the 
channels sensitive to the higher harmonics of the pattern. One way
in which the information in the higher-harmonic channels may be
used involves a more complicated spatial-frequency-channels model.
This  model  is  illustrated  in  Figure  24.  Each  channel  in  this  model 
contains  three  stages:  linear-filtering  followed  by  a  non-linear 
function  such  as  a  rectification,  followed  by  a  second  linear 
filter.  The  filters  in  both  stages  are  selective  for 
spatial-frequency  and  orientation.  A  large  number  of  such  channels 
tuned  to  various  spatial  frequencies  and  orientations  are  assumed 
to  exist.  We  call  these  channels  "complex  channels"  to  distinguish 
them  from  the  simple  channels  of  the  earlier  model,  and  because 
this kind of channel is similar to current models of complex cells
(e.g., Hochstein and Spitzer, 1985; and Hochstein, 1985a, b).
Measures  of  pooled  second-stage  filter  activity  (analogous  to  that 
in  the  simple  model)  are  taken  to  be  the  predictor  of  perceived 
texture  segregation. 
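
The three stages can be illustrated with a deliberately small one-dimensional sketch. It is not the model of Figure 24: the first-stage filter is assumed to be a two-sample difference (a crude high-frequency, edge-sensitive filter), the nonlinearity is assumed to be full-wave rectification, and the second-stage filter is reduced to measuring the Fourier component at the pattern's fundamental. In one dimension, bar width stands in for area. All names and parameter values are hypothetical.

```python
import cmath

PERIOD = 256  # samples per pattern period (hypothetical)

def bar_profile(width_a, contrast_a, width_b, contrast_b):
    """One period with bar A centred at PERIOD/4 and bar B at 3*PERIOD/4.

    Values are contrasts relative to a background of zero.
    """
    profile = [0.0] * PERIOD
    for centre, width, contrast in (
        (PERIOD // 4, width_a, contrast_a),
        (3 * PERIOD // 4, width_b, contrast_b),
    ):
        start = centre - width // 2
        for i in range(start, start + width):
            profile[i] = contrast
    return profile

def fundamental(signal):
    """Magnitude of the component at one cycle per period."""
    n_samples = len(signal)
    total = sum(v * cmath.exp(-2j * cmath.pi * n / n_samples)
                for n, v in enumerate(signal))
    return abs(total) / n_samples

def complex_channel(signal):
    """Assumed stages: difference filter, full-wave rectifier, fundamental."""
    # Stage 1: circular two-sample difference marks the bar edges.
    edges = [signal[n] - signal[n - 1] for n in range(len(signal))]
    # Stage 2: full-wave rectification makes every edge response positive.
    rectified = [abs(e) for e in edges]
    # Stage 3: low-frequency measurement on the rectified signal.
    return fundamental(rectified)

# Width x contrast is equated: 16 x 1.0 == 8 x 2.0.
pattern = bar_profile(16, 1.0, 8, 2.0)
print("simple channel :", fundamental(pattern))
print("complex channel:", complex_channel(pattern))
```

With these values the raw profile has almost no energy at the fundamental, because width x contrast is matched, whereas the rectified edge signal still does. A second-stage filter tuned to the fundamental can therefore pick up structure that a single linear filter at the fundamental misses.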

Linear  filtering  followed  by  a  rectifying  nonlinearity  has 
been proposed by Grossberg & Mingolla (1985), Chubb & Sperling
(1988), and Bergen & Adelson (1988). Grossberg and Mingolla (1985)
have proposed a model containing the three stages of the
complex-model plus additional processes, and have suggested that
their model accounts for perceived segregation in patterns such as
ours. Their demonstrations, however, involve only filterings
sensitive to the higher harmonics of the pattern (small receptive
fields relative to the pattern periodicity). On the basis of the
findings reported in Section 2.3.5 (page 18), we believe that it
is unlikely that higher harmonic information is the major
determinant  of  perceived  segregation.  The  trade-off  between  area 
and  contrast  also  suggests  that  the  low  frequencies  matching  the 
period  of  the  pattern  are  important  in  texture  segregation. 

The  complex  channel  model  and  the  effects  of  light  adaptation 
appear  able  to  account  qualitatively  for  all  our  discrepancies. 
Specifically, let us examine in some detail how the complex channel
model explains the finding that the greater the size difference
between squares, the greater the minimal segregation rating.
Consider a complex channel in which the first filtering is
sensitive to a very high spatial frequency while the second
filtering (after the nonlinearity) is sensitive to a spatial
frequency near the fundamental of the pattern. Such a complex
channel can respond to what might be called "low-spatial-frequency
patterns of high-spatial-frequency elements". For our example,
we will consider the channel where both stages are sensitive to
vertical orientations. The first high-frequency filtering
extracts the edges of the squares as illustrated by the narrow dark
and light bands in the right panels of Figures 3 and 4. The
nonlinearity then changes below-zero first-stage outputs (dark
bands in Figures 3 and 4) either to zero outputs (if half-wave
rectification or something similar is assumed) or to above-zero
outputs (if full-wave rectification or something similar is
assumed). Since the large squares have longer edges than do the
small squares, the second-stage filter (which is sensitive to the
fundamental of the pattern and to vertical orientations) will have
greater outputs where there are columns of large squares than where
there are columns of small squares, when the contrasts of the
squares are equal. Remember that the area of squares increases
quadratically with edge length. Therefore, at the contrast ratios
where area x contrast of the large and small squares are equated,
the edge-length x contrast of the small square will actually be
greater than that of the large square. The amount by which it is
greater will be larger for 16 and 8 pixel squares, for example,
than for the 16 and 12 pixel squares. In general, when both edge
lengths and areas count, as in the complex channels model, the
edges should attenuate the dip in the u-shaped function more for
larger size discrepancies between squares (as long as lower
sensitivity to smaller size does not cancel out the greater size
difference).
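
The arithmetic for the two pairs of squares mentioned above can be worked through directly. The helper below is a hypothetical illustration, not part of the model: it fixes the large square's contrast at 1, chooses the small square's contrast so that area x contrast is equated, and then reports how much larger the small square's edge-length x contrast is.

```python
from fractions import Fraction

def edge_contrast_advantage(large_side, small_side):
    """Ratio of small-square to large-square edge-length x contrast
    at the point where area x contrast is equated (hypothetical helper).
    """
    large_contrast = Fraction(1)
    # Equate area x contrast:
    #   small_side**2 * small_contrast == large_side**2 * large_contrast
    small_contrast = large_contrast * Fraction(large_side**2, small_side**2)
    return (small_side * small_contrast) / (large_side * large_contrast)

for large_side, small_side in [(16, 8), (16, 12)]:
    print(large_side, small_side,
          edge_contrast_advantage(large_side, small_side))
```

The ratio reduces to the large side divided by the small side: 2 for the 16 and 8 pixel squares and 4/3 for the 16 and 12 pixel squares, so the edge signal left over at the area x contrast match grows with the size difference.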

A  consequence  of  our  hypothesis  is  that  if  lines  (rectangles) 
are  the  texture  elements  (a  rectangle's  area  increases  linearly 
with edge-length), the minimum texture segregation should be the
same for different area-ratios of the rectangles. When the area
x contrast of the large and small rectangles are equated (and the
first-stage filterings sensitive to the fundamental show little
modulated activity), the edge-length x contrast is also equated so
the complex channel also shows little modulated activity. Figure
10 compares the texture segregation of patterns composed of 4
area-ratios of squares (top panel) and lines (bottom panel). For
textures composed of squares, the greater the size difference
between the squares the shallower the trough, i.e., the minimum
segregation was greater. For textures composed of lines, the size
difference of the lines did not yield different minimum segregation
ratings.
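
The prediction for lines can be written out in one step. The notation is introduced here for illustration and is not the report's: w is the common width of the lines, \ell_L and \ell_S are the lengths of the long and short lines, and c_L and c_S are their contrasts. The derivation assumes that only length differs between the two kinds of lines.

```latex
\begin{align*}
\underbrace{w\,\ell_L}_{\text{area}}\,c_L
  &= \underbrace{w\,\ell_S}_{\text{area}}\,c_S
  && \text{(area $\times$ contrast equated)} \\
\Longrightarrow\quad \ell_L\,c_L &= \ell_S\,c_S
  && \text{(edge-length $\times$ contrast also equated)}
\end{align*}
```

Because the width cancels, the contrast ratio that silences the channels tuned to the fundamental also silences the edge-driven complex channel, whatever the length ratio. No such cancellation occurs for squares, whose area grows with the square of the edge length.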

The cross-over of the functions for different fundamental
frequencies in Figure 12 (right panel) may also be explained by
the complex channels model. Consider the 1 c/deg and 4 c/deg
fundamental frequencies. Because the higher harmonics of the 4
c/deg patterns fall in a range of low contrast sensitivity, the
perceived segregation of these patterns is primarily determined by
initial filtering at the fundamental frequency. Thus, when the
area x contrast of the large and small squares is equated, the
rated segregation in Figure 12 (right panel) is close to zero. For
the 1 c/deg patterns, however, the higher harmonics are in a more
sensitive range of the contrast sensitivity function and can
contribute to perceived segregation. According to the
complex-channels  model,  the  influence  of  the  squares'  edges  (picked 
up  by  the  initial  filtering  at  the  higher-harmonic  frequencies 
followed  by  rectification  and  refiltering  at  the  fundamental 
frequency)  will  attenuate  the  dip  in  the  u-shaped  function.  As  one 
moves  away  from  the  point  of  equal  areal  contrasts,  the  differences 
in  the  outputs  of  the  initial  filtering  increase  more  rapidly  for 
the  patterns  having  a  fundamental  of  4  c/deg  than  of  1  c/deg 
because  4  c/deg  is  closer  to  the  optimum  of  the  contrast 
sensitivity  function.  Thus  the  curves  should  cross. 

It  is  interesting  to  note  that  a  condition  where  the  area- 
ratio  of  large  and  small  squares  was  4:1  while  the  sizes  of  the 
squares  varied  appears  both  in  the  density  experiment  (Figure  11 
bottom panel) and the scaling experiment (Figure 12 right panel).
In  the  density  experiment,  however,  the  period  of  the  pattern  was 
kept  constant  while  in  the  scaling  experiment,  the  period 
increased  proportionately  to  the  sizes  of  the  squares.  In  both 
cases,  the  minimum  ratings  increased  as  the  sizes  of  the  squares 
(and  thus  the  difference  between  the  sizes)  increased.  However, 
the cross-overs occurred only when the fundamental frequency
changed.  This  suggests  that  it  is  the  changing  contribution  of 
edges  due  to  their  weighting  by  contrast  sensitivity,  rather  than 
the  smaller  size  differences  between  the  squares  that  produces  the 
cross-overs in Figure 12 (right panel).

2.7  Comparison  of  perceived  segregation  and  perceived  lightness 

A  striking  finding  reported  by  Beck,  Sutter,  &  Ivry  (1987) 
suggested  that  squares  of  equal  size  differing  clearly  in  lightness 
failed  to  yield  good  perceived  segregation.  The  observations  of 
the  experimenters  further  suggested  that  equal  lightness 
differences  yielded  very  different  degrees  of  segregation  depending 
on  the  background  luminance.  When  the  background  luminance  is 
above that of the objects, lightness is agreed to be a logarithmic
function of reflectance or relative luminance to a first
approximation, i.e., the ratio of the luminance of the square to
the luminance of the background or to the average luminance (Judd
& Wyszecki, 1963; Helson, 1964). Figure 25 (from Beck, Sutter, and
Ivry, 1987) gives mean perceived segregation ratings as a function
of the ratio of the luminances of the two squares (top panel) and
as a function of the ratio of the contrasts of the two squares
(bottom panel). [For this figure, contrast was actually computed
as the difference between the square luminance and the mean
luminance of the whole pattern divided by the mean luminance, that
is, an extension of the quantity called Rayleigh contrast by
Shapley and Enroth-Cugell (1985) for periodic patterns.] Different
curves represent different ratios of the background to the higher
of the two square luminances. The different curves coincide when
plotted against contrast ratio (bottom panel) but not when plotted
against luminance ratio (top panel). Since it is luminance ratio
that is thought to determine lightness, this suggests that the
strength of perceived texture segregation is not determined by the
magnitude  of  the  lightness  difference  between  squares. 

Lightness  Matches—  Two  experiments  were  conducted  in  which 
subjects both rated the perceived texture segregation of a pattern
and also matched the lightnesses of each of the two kinds of
squares composing a pattern to chips of varying Munsell lightness
values (Beck, Sutter & Graham, 1988). In the first experiment, the
background was the same for all the patterns (30 ft.-L.), and the
squares were always darker than the background. Figure 26 shows
the segregation ratings plotted against the difference between the
lightness-match values for the two squares in the pattern.
Clearly, equal lightness differences lead to different segregation
judgments depending on the ratio of the background luminance to the
higher  square  luminance. 

Figure  27  shows  the  rated  segregations  (top  row)  and  the 
differences  between  the  lightness  matches  (bottom  row)  plotted  as 
a function of the ratio of the luminances of the two squares (left
column) and of the ratio of the Weber contrasts of the two squares
(right column). Different curves in the upper left and lower right
show results for different ratios of background luminance to the
higher  square  luminance.  In  the  upper  right  and  lower  left  panels 
the  results  all  fell  on  the  same  function  (within  experimental 
error)  and  so  are  not  distinguished.  In  short,  perceived  texture 
segregation  appears  to  be  a  single-valued  function  of  the  ratio 
of  the  contrasts  of  the  two  squares  while  the  difference  between 
the  lightness  matches  appears  to  be  a  single-valued  function  of  the 
ratio  of  the  luminances  of  the  two  squares. 
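
The distinction between the two ratios is easy to see with numbers. The sketch below uses the 30 ft.-L. background of the first experiment, but the square luminances are made-up values chosen for illustration, and the convention of taking each ratio as larger magnitude over smaller magnitude is an assumption.

```python
from fractions import Fraction

def weber_contrast(luminance, background):
    """Weber contrast of a square against its background."""
    return Fraction(luminance - background) / Fraction(background)

def ratios(l_high, l_low, background):
    """Return (luminance ratio, Weber-contrast ratio) for two squares
    that are both darker than the background.

    Assumption: each ratio is larger magnitude over smaller magnitude,
    so the dimmer square's contrast goes in the numerator.
    """
    luminance_ratio = Fraction(l_high, l_low)
    contrast_ratio = (weber_contrast(l_low, background)
                      / weber_contrast(l_high, background))
    return luminance_ratio, contrast_ratio

background = 30  # ft.-L., as in the first experiment
# Hypothetical square luminances, both darker than the background.
print(ratios(20, 10, background))  # background / higher square = 3/2
print(ratios(10, 5, background))   # background / higher square = 3
```

Both hypothetical patterns have a luminance ratio of 2, yet their contrast ratios are 2 and 5/4. On the findings above, such a pair would be expected to give similar lightness-match differences but different segregation ratings.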

In a second experiment, three backgrounds were used: a black
background (.99 ft.-L.), a gray background (16.1 ft.-L.), and a
white background (40 ft.-L.). The squares were always above the
black background, above and below the gray background
(opposite-sign-of-contrast), and below the white background. The
results were similar to the first experiment. The functions
describing the lightnesses of the squares differed from the
functions describing the texture segregation of the patterns.
Perceived texture segregation tended to be a monotonic function of
the contrast ratio of the squares. With the gray background
(opposite-sign-of-contrast squares whose contrast ratio was always
1), perceived segregation decreased when the luminances of the
squares were close to the background. The perceived lightness of
the squares was a monotonic function of the ratio of the luminances
of the squares.


Is  perceived  lightness  computed  independently  of  and/or  at 
a  later  stage  from  the  stage  controlling  perceived  texture?  In 
some  sense,  the  answer  seems  likely  to  be  yes.  For  lightness  is 
clearly seen as a property of clearly perceived squares, and each
square is seen as of uniform lightness. Yet in our models of
perceived texture segregation, low-level channels are primarily
responsible for perceived texture segregation and these channels
do not have the right properties to signal the presence of
homogeneously illuminated "squares" in the pattern. On the other
hand  the  fact  that  both  perceived  texture  segregation  and  perceived 
lightness  are  functions  of  related  stimulus  properties  (at  least 
through  some  range  of  conditions)  suggests  a  closer  relationship 
of  some  sort.  The  similarity  of  the  contrast  computations  to  the 
weighting  functions  of  the  receptive  fields  suggests  that  they  are 
computed  first  and  that  the  luminance  ratio  determining  perceived 
lightness is computed from them. Neurally, the contrast
computations can be interpreted as comparing the excitation
produced by a patch with a base firing rate. For a square on a
larger background, one can assume that the base rate is set
primarily  by  the  background.  The  contrast  calculation  computes  how 
much  the  firing  rate  due  to  the  luminance  of  a  square  deviates  from 
the  base  rate  set  by  the  luminance  of  the  background.  It  is  the 
relevant  computation  for  determining  the  perceived  strength  of 
texture  segregation.  Perceived  lightness  is  determined  to  a  first 
approximation  by  the  ratio  of  the  luminance  of  a  square  to  the 
luminance  of  the  background.  It  is  based  on  what  proportion  of  the 
base  rate  set  by  the  luminance  of  the  background  is  the  firing  rate 
of  an  object.  This  can  be  computed  by  adding  1  to  the  contrast 
computation.  (The  value  of  1  represents  neurally  the  firing  rate 
of the background expressed in terms of the background firing rate
as  the  unit  of  measurement.)  Thus,  the  luminance  ratio  determining 
perceived  lightness  can  be  computed  neurally  from  the  firing  rate 
signaling  the  luminance  difference  between  a  square  and  the 
background  by  adding  the  firing  rate  of  the  background  and 
comparing  it  to  the  background  firing  rate.  This  calculation 
assumes  that  there  are  cells  in  the  visual  system  that  are  not  zero 
balanced,  i.e.,  the  excitatory  and  inhibitory  responses  do  not 
cancel  each  other.  In  a  uniform  field,  a  Ganzfeld,  such  cells 
respond  with  a  maintained  discharge. 
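
The arithmetic behind this conjecture is short. Writing L for the luminance of a square and L_0 for the luminance of the background (symbols chosen here for illustration), and assuming the contrast computation takes the Weber form:

```latex
\begin{align*}
C &= \frac{L - L_0}{L_0}, \\[4pt]
\frac{L}{L_0} &= \frac{(L - L_0) + L_0}{L_0} = C + 1 .
\end{align*}
```

For instance, with the hypothetical values L = 24 and L_0 = 16, the contrast is C = 0.5 and the luminance ratio is 1.5.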

We  recognize  that  our  conjecture  is  highly  speculative. 
However, we find it suggestive and hope that by pursuing it we will
better come to understand how perceived lightness is related to the
outputs of spatial-frequency channels.

2.8  Global  popout 

We have been examining the global popout of lines and curves.
By global popout, we mean the rapid, preattentive perception of a
global configuration such as the dotted line in Figure 28.
Treisman & Gormican (1988) have done pioneering work on popout but
have concentrated on looking for a single target amidst many
distractors. They have shown that a single feature can be detected
preattentively  and  in  parallel  while  a  combination  of  features 
requires  serial  search  and  focussed  attention. 


FIGURE  28 


2.8.1  Comparison  of  texture  segregation  and  the  popout  of  lines 

Beck,  Rosenfeld,  Ivry,  &  Navab  (1988)  have  studied  the 
factors  affecting  the  rapid  detection  of  a  line  composed  of 
disconnected shapes embedded in a background of the same shapes.
Figure 28 shows a dotted line embedded in a background of dots.
The line is seen immediately and effortlessly. It is possible that
the spatial frequency mechanisms involved in the discrimination of
texture regions also detect the line. For example, differences in
the outputs of filters in the line region and in neighboring
regions above and below the line may control the rapidness of line
detection. 

This  kind  of  computation  does  not  appear  able  to  account  for 
the  detection  of  the  lines.  We  have  examined  whether  the  filtered 
output in a 10 pixel strip about the line differs from the
filtered  output  in  the  background  (40  pixel  strips  above  and  below 
the  line  region)  for  the  data  reported  by  Smits,  Vos  and  Oeffelen 
(1985).  Figure  29  shows  the  strip  about  the  line  and  the  top  and 
bottom  background  strips  superimposed  over  a  filtering  of  the 
pattern.  There  was  no  significant  correlation  between  differences 
in either the means, standard deviations, or maximum outputs of
the  filters  in  the  line  and  in  the  background  strips,  and  the 
rapidity  with  which  a  line  was  detected.  The  maximum  Spearman 
rank  correlation  was  .21  when  we  summed  the  differences  across  all 
filters  and  .30  for  individual  filters. 
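
For readers who want to reproduce this kind of analysis, the sketch below computes a Spearman rank correlation as the Pearson correlation of ranks, with tied values sharing their mean rank. It is a generic implementation, not the authors' analysis code, and the numbers fed to it are invented purely to exercise the function.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx)
    spread_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / (spread_x * spread_y) ** 0.5

# Made-up values for illustration only; they are not data from the report.
output_difference = [0.8, 0.3, 0.5, 0.1, 0.9]  # line strip minus background
detection_speed = [3, 4, 1, 2, 5]              # higher = detected faster
print(spearman(output_difference, detection_speed))
```

A coefficient near 1 would mean that displays with larger filter-output differences are consistently the ones detected fastest. Coefficients as low as those reported above indicate only a weak monotonic relation.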


2.8.2  Popout  experiments 

Why  do  spatial  frequency  channels  predict  texture  segregation 
but  fail  to  predict  line  detection?  Before  turning  to  this 
question  I  want  to  report  three  studies  that  we  have  conducted. 

In one experiment, we investigated how the alignment and
misalignment of edges affected line detection. Four sizes of squares
were investigated. The stimuli were scalings of one another.
Square size, the spacing between the squares in the line, and the
lateral displacement of the misaligned squares were increased
proportionally.  Figure  30  shows  examples  of  stimuli  with  aligned 
and  misaligned  squares.  Stimuli  were  flashed  for  150  msec  and 
subjects  were  required  to  judge  whether  the  line  was  vertical  or 
horizontal.  We  recorded  both  reaction  time  and  errors.  The  two 
measures  agreed  closely  and  we  shall  report  only  reaction  times. 

The  X  axis  in  Figure  31  plots  the  square  sizes  in  pixels  and 
the  Y  axis  the  mean  reaction  times.  There  is  a  striking  difference 
between  the  collinear  and  misaligned  squares.  For  the  misaligned 
squares,  reaction  times  remained  constant  with  increasing  square 
size  and  scaling  of  the  stimuli.  If  the  visual  system  is  detecting 
the density of squares in a particular direction, reaction time
would be expected to remain constant since density remains constant
if size and spacing are increased proportionally. For the aligned
squares,  the  reaction  times  decreased  with  increasing  square  size 
and  scaling  of  the  stimuli.  Increased  edge  alignment  facilitated 
detecting  the  line. 

Two  further  experiments  demonstrate  the  importance  of  edge 
alignment. The stimuli were composed of squares 8 and 16 pixels on
a side and were scalings of one another. Figure 32 shows the reaction
times. As in the previous experiment, the larger squares were
detected more rapidly than the smaller squares. This occurred
at  all  luminance  levels  even  when  the  areal  contrast  (area  x 
contrast)  of  the  smaller  square  was  much  greater  than  the  areal 
contrast  of  the  larger  square.  The  length  of  the  squares  was  more 
important  than  the  areal  contrast  of  the  squares  for  detecting  the 
line.  A  third  experiment  compared  solid  squares  with  outline 
squares.  The  squares  were  10  and  20  pixels  on  a  side  and  were 
scalings of one another. Figure 33 shows that the outline squares
were  detected  as  rapidly  as  the  solid  squares.  Filtering  the 
displays  showed  that  the  difference  between  the  filtered  output  of 
the  line  region  and  the  outputs  of  the  background  region  was  for 
the  outline  squares  60  percent  of  that  of  the  solid  squares. 

2.8.3  Difference  between  texture  segregation  and  line  detection 

What  is  the  difference  between  the  texture  patterns  and  the 
line  displays?  For  preattentive  pattern  vision  such  as  immediate 
effortless  texture  segregation  and  line  detection,  we  believe 
perception is a direct function of lower-level analyzers. We
propose that preattentive pattern vision is a function of the
information from 3 types of analyzers: bar detectors, spot
detectors, and edge detectors. The large bar detectors provide
information about the overall changes in luminance. The spot
detectors and the small bar detectors give information for the size
and orientation of the elements of a pattern. The edge detectors
give information for the arrangement of the edges in a pattern.

The  spot  and  edge  detectors  provide  no  information  for 
segregating  the  texture  regions  in  Figure  1.  The  spot  detectors 
tell  us  that  there  are  two  populations — large  and  small  squares. 
There is, however, no spatial differentiation as a result of their
outputs. The centroids of the populations of large and small
squares are the same. There is also no information from the edge
detectors. Though there is more alignment of the squares in the
top region than in the center region, there is a strong alignment
signal coming from both regions. The only spatial differentiation
is from the large bar detectors which signal differences in the
overall pattern of luminance changes in the top and center regions.
In the top region the changes of overall luminance occur in the
direction of the X axis and in the center region in a direction 45
degrees to the X axis. The large bar detectors are not sensitive
to edge alignment and we have found that, unlike the line displays,
perceived segregation in our texture displays is not affected by
either the absence of aligned edges or the misalignment of edges.

The  long  line  in  the  line  displays  consists  of  elements  having 
the  same  size,  contrast,  and  orientation  as  the  background 
elements. Line detection cannot be explained in terms of
differences in the response of large spot or bar detectors. The
line does not occupy a separate region of the display. As
illustrated in Figure 34, the bar detectors become wider as they
become longer and a detector long enough to fall on 3 dots of the
line (upper right image) will also fall on many background dots.
The small bar detectors, spot detectors, and edge detectors give
similar responses to elements in the line and in the background.
What is suggested is that line detection is not the result of
differences in the outputs of large spot or bar detectors
(detectors falling on 3 or more elements of the line) but results
from local operations on the outputs of small detectors. The
relevant properties of the outputs of spot detectors are color,
contrast, size and possibly sign of contrast. The outputs of bar
detectors have in addition the properties of orientation and
elongatedness. The outputs of edge mechanisms are oriented edge
segments. The local operations that detect a line may be of
various kinds. There may be a linking of similar outputs. Aligned
horizontal squares may link to form a long line. The length of the
long line is an emergent feature that makes it stand out from the
surrounding region. Local operations may also direct a fast search
process. Search is focused by straight edges. Short edges,
however, do not focus search as well as long edges. For misaligned
elements a sharp focus does not help. Rapid line detection occurs
because search is focussed by local operators that compute a
greater element density in a given direction.

We  are  investigating  these  possibilities. 
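
As a rough illustration of the last possibility, the sketch below
shows one way a local operator could compute element density in a
given direction. It is our own minimal sketch, not a mechanism
specified in this report: elements are reduced to point positions,
and the names directional_density and preferred_direction, the strip
dimensions, and the four sampled orientations are all assumptions
chosen for illustration.

```python
import math

def directional_density(points, center, angle_deg,
                        half_length=3.0, half_width=0.25):
    """Count elements inside a narrow strip through `center`.

    The strip is oriented at `angle_deg`; its dimensions are
    illustrative assumptions, not values taken from the report.
    """
    theta = math.radians(angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)
    count = 0
    for x, y in points:
        dx, dy = x - center[0], y - center[1]
        if dx == 0 and dy == 0:
            continue  # do not count the central element itself
        along = dx * ux + dy * uy     # distance along the strip axis
        across = -dx * uy + dy * ux   # distance perpendicular to the axis
        if abs(along) <= half_length and abs(across) <= half_width:
            count += 1
    return count

def preferred_direction(points, center, angles=(0, 45, 90, 135)):
    """Return the sampled orientation with the greatest element density."""
    densities = {a: directional_density(points, center, a) for a in angles}
    best = max(densities, key=densities.get)
    return best, densities

# A horizontal row of seven elements embedded among four background elements.
line = [(float(x), 0.0) for x in range(-3, 4)]
background = [(1.0, 2.0), (-2.0, -1.5), (0.5, -2.5), (-1.0, 1.5)]
best, densities = preferred_direction(line + background, (0.0, 0.0))
```

In this toy display the strip aligned with the row collects its six
neighboring elements, while the other sampled orientations collect
none, so a search process could be directed along the horizontal.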
Principal Investigator: Jacob Beck

Senior Research Investigator: Norma Graham

Graduate  students:  Richard  Ivry,  Nassir  Navab,  Anne  Sutter,  James 
Tanaka 

Undergraduate students: Kirk Helms, Dorte Poulsen

5. MEETINGS (reporting AFOSR research, 1985-1988)

Third Human and Machine Vision Workshop, Boston, Massachusetts,
1985, "Spatial filtering, features, and grouping in texture
segregation."

Eleventh Annual Interdisciplinary Conference, Whistler, British
Columbia, 1986, "Spatial filtering, features, and grouping in
texture segregation."

Center for Visual Science Symposium on Computational Models in
Human Vision, Rochester, New York, 1986, "Unresolved issues in
texture segregation."

First International Conference on Computer Vision, London,
England, 1987, "Visual image processing in texture segregation."

Optical Society of America Meeting on Color Appearance, Annapolis,
Maryland, 1987, "On the role of figural organization in perceptual
transparency."

Meetings of the Psychonomic Society, New Orleans, Louisiana,
November, 1987, "On the role of figural organization in
perceptual transparency."

First Annual Workshop on Goals for Machine Vision, U.C.L.A., James
E. West Center, Los Angeles, California, 1987, "Human and machine
image processing in texture segregation."

AFOSR Meeting on Visual Information Processing, Annapolis,
Maryland, 1987, "Visual image processing in texture
segregation."

Colloquia: University of Maryland (Department of
Psychology, 1986), Boston University (Center for Adaptive
Studies, 1986), Hebrew University (Department of Computer Science,
1987, 1988), University of Haifa (Department of Psychology, 1987),
University of Pennsylvania (Department of Computer Science,
1987), Rutgers University (Department of Psychology, 1987),
National Institutes of Health (Department of Applied Mathematics,
1988), University of Massachusetts (Department of Computer
Science, 1988).


6.  REFERENCES 

Beck, J. 1972. Similarity grouping and peripheral discriminability
under uncertainty. American Journal of Psychology, 85, 1-19.

Beck, J. 1982. Textural segmentation. In Organization and Representation
in Perception. J. Beck, Ed. Hillsdale, N.J.: Erlbaum, 285-317.

Beck, J. 1983. Textural segmentation, second-order statistics,
and textural elements. Biological Cybernetics, 48, 125-130.

Beck, J., Prazdny, K., & Rosenfeld, A. 1983. A theory of textural
segmentation. In Human and Machine Vision. J. Beck, B. Hope, &
A. Rosenfeld, Eds. New York: Academic, 1-38.

Beck, J., Sutter, A. & Ivry, R. 1987. Spatial frequency channels
and perceptual grouping in texture segregation. Computer Vision,
Graphics, and Image Processing, 37, 299-325.

Beck, J., Sutter, A. & Graham, N. 1988. Comparison of perceived
segregation and perceived lightness in texture segregation. In
preparation.

Bergen, J. R. & Adelson, E. H. 1988. Early vision and texture
perception. Nature, 333, 363-364.

Caelli, T. 1982. On discriminating visual textures and images.
Perception & Psychophysics, 31, 149-159.

Caelli, T. 1985. Three processing characteristics of visual
texture segregation. Spatial Vision, 1, 19-30.

Chubb, C. & Sperling, G. 1988. Processing stages in non-Fourier
motion perception. Supplement to Investigative Ophthalmology and
Visual Science, 29, 266.

Daugman, J.G. 1985. Uncertainty relation for resolution in space,
spatial frequency, and orientation optimized by two-dimensional
visual cortical filters. Journal of the Optical Society
of America A, 2, 1160-1169.

Daugman, J.G. 1987. Image analysis and compact coding by oriented
2D Gabor primitives. S.P.I.E. Proceedings, 758, 19-30.

Gagalowicz, A. 1981. A new method for texture field synthesis:
applications to the study of human vision. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 3, 520-533.


Ginsburg, A.P. 1984. Visual form perception based on biological
filtering. In Sensory Experience, Adaptation, and Perception. L. Spillman &
B.R. Wooten, Eds. Hillsdale, N.J.: Erlbaum, 53-72.

Graham, N. 1980. Spatial frequency channels in human vision:
detecting edges without edge-detectors. In Visual Coding and
Adaptability. C.S. Harris, Ed. Hillsdale, N.J.: Erlbaum, 215-262.

Graham, N. 1981. Psychophysics of spatial frequency channels. In
Perceptual Organization. M. Kubovy & J. R. Pomerantz, Eds. Hillsdale,
N.J.: Erlbaum, 53-72.

Graham, N. 1985. Detection and identification of near-threshold
visual patterns. Journal of the Optical Society of America A, 2, 1468-1482.

Graham, N. 1988. Low-level visual processes and texture
segregation. Physica Scripta. In press.

Graham, N., Sutter, A. & Beck, J. 1988. Contrast and spatial
variables in texture segregation: suggesting a complex spatial
frequency model. In preparation.

Grossberg, S. 1987. Cortical dynamics of three-dimensional form,
color, and brightness perception: monocular theory. Perception &
Psychophysics, 41, 87-116.

Grossberg, S. & Mingolla, E. 1985. Neural dynamics of perceptual
grouping: textures, boundaries, and emergent features. Perception &
Psychophysics, 38, 141-171.

Helson, H. 1964. Adaptation Level Theory. New York: Harper and Row.

Hochstein, S. & Spitzer, H. 1985. One, few, infinity: linear and
nonlinear processing in the visual cortex. In Models of the Visual
Cortex. D. Rose & V.G. Dobson, Eds. New York: Wiley, 341-350.

Judd, D.B. & Wyszecki, G.W. 1963. Color in Science, Business, and Industry.
New York: Wiley.

Julesz, B. 1975. Experiments in the visual perception of texture.
Scientific American, 232, 34-43.

Julesz, B. 1981. Textons, the elements of texture perception and
their interactions. Nature, 290, 91-97.

Klein, A. & Tyler, C.W. 1986. Phase discrimination of compound
gratings: generalized autocorrelation analysis. Journal of the Optical
Society of America A, 3, 868-879.


Marr, D. 1976. Early processing of visual information. Philosophical
Transactions of the Royal Society, London, B275, 483-524.

Shapley, R. & Enroth-Cugell, C. 1985. Visual adaptation and retinal
gain controls. In Progress in Retinal Research, Volume 3. N.N. Osborne &
G.J. Chader, Eds. New York: Pergamon, 263-346.

Spitzer, H. & Hochstein, S. 1985a. Simple- and complex-cell
response dependences on stimulation parameters. Journal of
Neurophysiology, 53, 1244-1265.

Spitzer, H. & Hochstein, S. 1985b. A complex-cell receptive-field
model. Journal of Neurophysiology, 53, 1266-1286.

Sutter, A. 1987. The interaction of size and contrast in perceived
segregation: a spatial frequency analysis. Ph.D. Thesis,
University of Oregon, Eugene, Oregon.

Sutter, A., Beck, J., & Graham, N. 1988. Contrast and spatial
variables in texture segregation: testing a simple spatial
frequency channels model. In preparation.

Treisman, A. & Gormican, S. 1988. Feature analysis in early
vision: evidence from search asymmetries. Psychological Review, 95,
15-46.

Turner, M.R. 1986. Texture discrimination by Gabor functions.
Biological Cybernetics, 55, 71-82.

Victor, J. D. & Brodie, S.E. 1978. Discriminable textures with
identical Buffon-needle statistics. Biological Cybernetics, 31, 231-234.

Watson, A.B. 1983. Detection and recognition of simple spatial
forms. In Physiological and Biological Preprocessing of Images. O.J. Braddick
& A.C. Sleigh, Eds. New York: Springer-Verlag, 110-114.

Wertheimer, M. 1923. Untersuchungen zur Lehre von der Gestalt II.
Psychologische Forschung, 4, 301-350.


PART  II  (Stevens) 


CONTENTS

1. INTRODUCTION

1.1 The Problem Areas

1.2 Summary of Current Working Hypotheses

2. RESEARCH RESULTS

2.1 Parsing Luminance Changes

2.2 The Café Wall as Evidence for Parsing

2.3 Asserting Orientation between Discrete Items

2.4 Connecting Contour Fragments across Gaps

3. REFERENCES

4. PUBLICATIONS, MEETINGS AND PERSONNEL


ABSTRACT 


We have conducted research into the perception of texture, concentrating on the earliest levels in the
extraction of geometric structure. The work has involved a computational and psychophysical study of the
role of retinal and cortical spatial frequency filters in the extraction of contour information. The specific
areas reported concern: i) the differential roles of radially-symmetric and elongated receptive fields on the
Café wall illusion, a pattern that is useful for the induction of illusory brightness bands and orientation, ii)
a strategy for parsing of band-pass filtered images to differentiate line-like versus edge-like luminance
changes, iii) asserting orientation between discrete items, and iv) connecting contour fragments across
luminance gaps. Across these areas one common theme is the importance of spatial gating nonlinearity.

1.  INTRODUCTION

1.1  The  Problem  Areas 

The  overall  goal  of  our  research  is  to  understand  the  visual  representations  and  processes  involved  in  texture 
perception.  Towards  that  end  we  have  recently  investigated  some  early  aspects  of  contour  and  form 
perception.  We  first  sketch  the  path  of  investigation  that  has  led  us  to  these  areas.  Earlier  we  examined
problems  of  bootstrapping,  i.e.  of  initiating  the  extraction  of  texture  descriptions  (Stevens  &  Brookes 
1987).  The  bootstrapping  study  was  a  direct  extension  of  (Stevens  1978)  where  parallelism,  a  primitive 
geometric  property  of  texture,  was  examined  in  terms  of  dot  patterns.  We  pursued  the  basic  question  of 
whether  orientation  (defined  e.g.  by  a  pair  of  adjacent  dots)  is  extracted  by  "symbolic"  virtual  lines  or  more 
directly  measured  by  elongated  receptive  fields  that  summate  luminance  energy.  As  recognized  by  the 
Gestalt  investigators  and  later  formulated  computationally  (Marr  1976,  1982),  at  some  point  perceptual
grouping  constitutes  associations  (such  as  pairings)  between  perceptual  wholes.  For  example,  virtual  lines 
might  represent  pairwise  groupings  between  similar  place  tokens.  It  would  be  significant  to  show  that  such 
grouping  processes  occur  at  an  early  stage  of  texture  processing.  Evidence  was  presented  that  supports  the 
virtual  line  hypothesis  (discussed  in  detail  later).  In  part  the  argument  hinged  on  a  broad  assumption  about 
the  linearity  with  which  luminance  signals  are  spatially  summated,  which  on  close  examination  led  to  our 
returning  to  question  the  symbolic  grouping  hypothesis  for  early  texture  processing.  As  we  report  in 
section  2.3,  while  some  grouping  phenomena  (such  as  a  preference  for  color  and  contrast  similarity)  still 
suggest  symbolic  pairings,  receptive  field  mechanisms  are  strongly  implicated  as  the  underlying  orientation 
detection  mechanism.  At  about  that  time  we  also  were  pursuing  the  role  of  the  concave  cusp  in 
determining  local  geometric  evidence  for  form  boundaries  (Stevens  &  Brookes  1988;  reported  first  in  the 
previous  final  report).  As  part  of  the  extension  of  that  work,  we  sought  to  examine  the  occurrence  of  cusps  in
natural  images,  which  entailed  implementing  an  edge  detection  operator  that  differentiated  edge-  from  line- 
like  luminance  changes.  Consideration  of  extensions  of  Watt  and  Morgan's  (1985)  one-dimensional  theory 
to  two-dimensional  images  led  to  work  reported  in  section  2.1.  Experience  with  actual  images  led  to 
developing  a  parsing  operator  that  incorporated  strong  nonlinearities.  This  behavior  was  then  recognized  to 
have  a  possible  counterpart  in  the  nonlinearity  exhibited  by  cortical  cells,  for  which  we  suggest  a  functional 
purpose.  Both  themes,  the  distinction  of  luminance  changes  of  different  type  and  the  use  of  spatial 
nonlinearity,  were  also  pursued  in  a  study  of  the  familiar  Café  wall  illusion  (section  2.2).  The  final  topic
we  review  is  the  characterization  of  form  extraction  as  the  compilation  of  local  contour  evidence.  This 
work,  which  is  in  an  early  stage,  is  also  found  to  implicate  receptive  fields  (section  2.4). 

1.2  Summary  of  Current  Working  Hypotheses

The  work  we  report  is  performed  within  a  larger  view  towards  texture  and  form  perception,  which  is 
outlined  in  the  following  discussion. 

Form Perception as Construction: A visual form is a perceptually coherent whole that is distinguished as
figure against its immediate background. A major aspect of the extraction of visual form involves
localizing and organizing its outline or silhouette contours. It is well accepted that the extraction of visual
form begins with the detection of local edge or contour information along the boundary, and concludes with
the construction of a distinct perceptual entity. The Gestalt investigators made a series of fundamental
observations that together establish the essentially constructive aspect of form perception: forming
associations among the component parts of a visual form. The associations or groupings seemingly involve
some type of conjoining primitives (sometimes referred to as a perceptual glue¹), rules for how
perceptual parts can be conjoined, and at least some general rules for the application of these rules
(instructions for assembly).

Preattentive versus Attentive Processing: The construction of a form is achieved substantially in parallel
and preattentively. To determine the correct closure or connectivity (e.g., across substantial gaps or contrast
reversals) is an intrinsically difficult computation and probably requires sequential processing, and hence
focal attention. Natural vision requires assembling a form with little prior knowledge of what that form
might correspond to physically, hence much of the initial aspects of the assembly process must be achieved
in a bottom-up manner. One aspect of form perception that we find particularly challenging is how the
process is initiated.

We suspect that many researchers have underestimated considerably the combinatorics of possible
forms posed by a natural image. An image presents a very large number of different forms that might be
constructed purely on the basis of an a priori reasonable scheme of rules for generating wholes from local
collections of fragments. We believe that actually very little form perception is performed beyond a very
modest initial stage of organization, and that the complexity of theoretically-possible forms that might be
extracted from an image poses an intrinsically insurmountable complexity barrier, one that tacitly defines
the distinction between preattentive and attentive, or scrutinous, vision. That is, the visual system uses
attentive processing (e.g. intrinsically-sequential computations) for form detection problems that cannot be
solved preattentively. Not all of form perception, however, requires focal attention: isolated, closed forms
can be extracted immediately and preattentively.

¹ Some researchers, notably Treisman and Julesz, propose that this glue requires attentive processing. We
suggest that the attentive aspects they address experimentally concern search and discrimination processing
that is more difficult and performed later than the early aspects of form perception we address here.

The Interplay between Psychophysics and Computational Models: The specific problems we address concern
the earliest precursors of visual form, and include the binding together, or conjoining, of contiguous (or
nearly contiguous) contour fragments, selecting particular structures of contours as figure versus ground,
and the subsequent perception of figural properties such as location and overall orientation. Each of these
problems can be treated at various levels of specificity: from understanding the basic visual strategies to
providing progressively detailed proposals regarding the underlying neural mechanisms. Most frequently,
we find ourselves using visual phenomena to infer basic computational issues, such as how orientation is
imposed on discrete visual items, or how observers preattentively decide which side of a contour is the more
likely associated with figure versus ground. What has been more difficult, but is becoming increasingly
tractable, is the mapping between strategies and implementations, i.e. demonstrating the involvement of
specific neural processes. These notions are sketched in the following, and pursued in more detail in
subsequent sections.

Observable  Behavior  and  Neurophysiology:  Understanding  the  internal  representation  of  a  perceptual 
grouping,  even  that  corresponding  to  the  simple  perceived  grouping  of  two  adjacent  dots  into  an  apparent 
pairing,  is  in  fact  an  extraordinarily  challenging  research  issue.  What  internal  quantities  might  we  assume 
underlie  the  dot  pairing?  First,  and  most  uncontroversially,  the  dot  pair  has  an  apparent  orientation 
corresponding  to  that  of  the  line  segment  connecting  the  two  dots.  The  orientation  of  the  dot  pair,  as  we 
will  return  to  discuss  in  detail,  is  likely  measured  by  orientation-selective  receptive  fields  (RF's).  While 
that  is  probably  unsurprising,  it  has  proven  quite  difficult  to  attribute  other  aspects  of  perceptual  grouping
behavior,  such  as  similarity  preferences,  to  local  receptive  fields.  Although  it  is  feasible  to  conclude  that
orientation  is  measured  by  a  specific  receptive  field  mechanism,  it  is  quite  another  matter  to  understand  how 
those  measurements  are  selected  and  used  to  extract  a  visual  appreciation  for  the  structure  implicit  in  the 
pattern.  There  are  as  yet  hard  limits  on  the  feasibility  of  relating  perceptual  grouping  behavior  to 
underlying  neural  mechanisms.  Nonetheless,  the  known  neurophysiology  can  be  used  to  constrain  our 
computational  proposals,  as  discussed  below. 

Discrete  (and  Symbolic)  versus  Continuous  (and  Analogical)  Processing:  How  might  the  visual  system 
achieve  the  effect  of  perceptual  grouping?  The  association  of  fragments  into  a  perceptual  whole  does  not 
necessarily  require  discrete  representations  for  the  individual  fragments  and  a  subsequent  discrete  act  of 
generating  a  symbolic  grouping  assertion.  There  has  been  persistent  difficulty  in  deciding  upon  a  formal 
way  of  describing  perceptual  groupings,  i.e.  to  have  a  description  that  captures  the  salient  aspects  of  seeing 
discrete  items  as  associated  perceptually,  and  yet  is  biologically  implementable.  Mathematically,  a 
grouping  among  n  items  might  be  characterized  formally  by  an  n-ary  relation  and  represented  by  an  n-tuple
whose  referents  are  pointers  to  the  representations  of  the  constituent  fragments.  But  how  this  abstraction  is 
implemented  neurally  remains  obscure.  Nonetheless,  the  visual  system  shows  ample  evidence  of  being  able 
to  operate  selectively  on  subpopulations  of  visual  items,  pulling  out  of  a  complex  display  the  structures 
that  carry  biologically  meaningful  organization.  One  is  then  aware  of  the  discrete  items  that  comprise  the 
apparent  pattern  and  organization.  The  perceptual  groupings  are  likely  achieved  by  neural  mechanisms  that 
individually  (in  single-cell  recordings)  exhibit  graded  and  continuous  behavior.  How  they  bring  about  such  discrete
behavior  in  concert  is  still  unknown.
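
The formal characterization mentioned above (an n-ary relation held as an n-tuple of pointers to the
constituent fragments) can be written down directly. The following is only a sketch of the abstraction,
with hypothetical names (PlaceToken, Grouping, virtual_line, orientation_deg) of our choosing; it makes no
claim about how, or whether, such a structure is implemented neurally.

```python
import math
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class PlaceToken:
    """A discrete visual item, reduced here to its image position."""
    x: float
    y: float

@dataclass(frozen=True)
class Grouping:
    """An n-ary grouping: a relation name plus an n-tuple of references."""
    relation: str
    members: Tuple[PlaceToken, ...]

    @property
    def arity(self) -> int:
        return len(self.members)

def virtual_line(a: PlaceToken, b: PlaceToken) -> Grouping:
    """A pairwise (2-ary) grouping between two place tokens."""
    return Grouping("virtual-line", (a, b))

def orientation_deg(pair: Grouping) -> float:
    """Orientation of the segment joining a pair, folded into [0, 180)."""
    a, b = pair.members
    return math.degrees(math.atan2(b.y - a.y, b.x - a.x)) % 180.0

dot_a, dot_b = PlaceToken(0.0, 0.0), PlaceToken(1.0, 1.0)
pair = virtual_line(dot_a, dot_b)
```

The tuple stores references to the token objects rather than copies, which is the sense in which the
referents of the n-tuple are pointers to the representations of the fragments.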

Nonlinearity and Visual Strategies: The extraction of a visual form requires discrimination of its boundary
from the background (and the interior details of the form) and assembly of the constituent parts into a
whole. These operations are intrinsically nonlinear, of course, and presumably are implemented by gating
(or veto) and selection nonlinearity mechanisms. Shunting inhibition within a feedback network can
achieve such strongly nonlinear behavior, as well as more continuous forms of nonlinearity, e.g. in gain
control and signal-to-noise enhancement (McCulloch & Pitts 1943; Grossberg 1982). The nonlinearity in
form perception is usually associated with later stages, e.g. the bistability of figure-ground decisions.
Moran and Desimone (1983) have shown evidence of gating suppression in extrastriate cortex driven by
selective attention. But the earliest stages of contrast processing in striate cortex are generally expected to
be approximately linear (Hubel & Wiesel 1962, 1968; Bishop, Coombs & Henry 1971; Bishop & Henry
1972). The major deviations from linearity are expected in length and width summation (Heggelund,
Krekling & Skottun 1983; Henry, Goodwin & Bishop 1978; Webster & DeValois 1985), presumably due
to a Gaussian-shaped weighting envelope, such as modelled by a Gabor filter (Daugman 1985) (see also
Movshon, Thompson & Tolhurst 1978 regarding superposition).

There is evidence for gating nonlinearity as early as simple cells. Hammond and MacKay (1981,
1983a) recently showed that length summation in simple cells is highly dependent on contrast. For stimuli
having constant contrast along their lengths, length summation was found to be substantially linear.
Additions of line gaps of reduced (background) contrast lowered the response of the cell by an amount
predictable by length summation considerations. However, reduction of contrast below background
(reversed contrast) produced an unpredictably large response decrement. The term "gating" inhibition was
given to this phenomenon to distinguish it from simple removal of excitatory drive, and the phenomenon
appears to be a property of complex cells as well (Hammond & MacKay 1983b, 1985).
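
The qualitative pattern described above can be caricatured in a few lines. The sketch below is a toy model
under our own assumptions, not the mechanism reported by Hammond and MacKay: the stimulus is a list of
contrast values along the receptive field length (positive for the preferred polarity, zero for a gap at
background level, negative for reversed contrast), and the gate is an assumed divisive (shunting-style) term
with a hypothetical gate_strength parameter.

```python
def linear_length_summation(contrasts):
    """Purely linear prediction: signed contrasts simply add, then rectify."""
    return max(sum(contrasts), 0.0)

def gated_length_summation(contrasts, gate_strength=4.0):
    """Linear summation of preferred contrast, divided down by reversed contrast.

    `gate_strength` is a hypothetical parameter; the divisive form is an
    assumption standing in for gating (or veto) inhibition.
    """
    excitation = sum(max(c, 0.0) for c in contrasts)
    reversed_drive = sum(max(-c, 0.0) for c in contrasts)
    return excitation / (1.0 + gate_strength * reversed_drive)

full_bar = [1.0, 1.0, 1.0, 1.0, 1.0]        # constant contrast along the length
gapped_bar = [1.0, 1.0, 0.0, 1.0, 1.0]      # gap at background contrast
reversed_gap = [1.0, 1.0, -1.0, 1.0, 1.0]   # gap of reversed contrast

for name, stimulus in [("full", full_bar), ("gap", gapped_bar),
                       ("reversed", reversed_gap)]:
    print(name, linear_length_summation(stimulus),
          gated_length_summation(stimulus))
```

For the full and gapped bars the two functions agree, so the gated unit behaves as a linear summator. Only
the reversed-contrast gap engages the gate, and the response then falls well below the linear prediction.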

There would be computational advantage to having an operator act as a linear device when
presented with appropriate stimuli, and be shunted when presented with a configuration for which the
measurement would be meaningless. We are examining the strategic utility of this principle in early
aspects of form perception. Selection, or enhancement, nonlinearity, on the other hand, is likely more
generally pervasive throughout early visual processing. At one extreme, selection might be involved in
contour detection as early as the LGN, and at the other, to underlie the local bistability of figure-ground
decisions. Grossberg and his colleagues have elucidated the computational principles of nonlinear systems
and shown their predictive value e.g. in boundary contour detection and brightness reconstruction (Grossberg
& Mingolla 1985). Specific instances of nonlinear behavior are extraordinarily difficult to describe,
however, because the observed nonlinearities reflect the purposes of the computation, which must first be
understood prior to modelling the source of the nonlinearity. Our approach is to use psychophysics to
further reveal the fundamental strategies underlying geometric form perception, with the proviso that
understanding the implementation of these strategies is a secondary concern.


2.  RESEARCH  RESULTS 


2.1  Parsing  Luminance  Changes 

Two prototypical luminance features are the step edge and the bar (where a line is the limiting case of a bar of
zero width). An ideal bar in an image would correspond to a pair of opposite-polarity edges in very close
proximity. Note that the definition of a bar feature in the image presumes a given scale. As bar width
increases the component edges of the bar separate until, for some arbitrary width dependent upon the design
of the visual system, the two edge events are no longer labelled as coupled, and the bar is no longer seen as
a unitary image event.

Psychophysically, certain perceptual phenomena such as the Chevreul illusion and the Mach band
exhibit a strong degree of scale sensitivity, which suggests that the detection of bar- and line-like luminance
changes might well be implemented by spatially organized receptive fields (Kulikowski & King-Smith 1973;
Shapley & Tolhurst 1973; Daugman 1985). The neurophysiology supports the general distinction of bar versus edge
in terms of even- versus odd-symmetric simple cell receptive fields, respectively. The receptive field of
appropriate orientation measures the approximate tangent to the contour. Wilson (1986), for example,
using a family of orientation-tuned receptive fields of different sizes, has shown how a variety of curvature-
related tasks can be subserved by elongated receptive field (RF) organizations (see also Gelb & Wilson 1983
and critique in Morgan & Ward 1983).
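
The correspondence between bar and edge features and even- and odd-symmetric receptive fields can be
checked on one-dimensional cross-sections. The sketch below is a minimal illustration under our own
assumptions: the receptive field profiles are sampled Gabor functions (cosine phase for even symmetry, sine
phase for odd symmetry) with arbitrary width and wavelength, the mean is subtracted so that uniform
luminance gives no response, and the function names are hypothetical.

```python
import math

def gabor_weights(phase, sigma=2.0, wavelength=8.0, half_width=10):
    """Sampled 1-D Gabor weighting with its mean removed (zero-balanced)."""
    xs = range(-half_width, half_width + 1)
    raw = [math.exp(-(x * x) / (2.0 * sigma * sigma))
           * math.cos(2.0 * math.pi * x / wavelength + phase) for x in xs]
    mean = sum(raw) / len(raw)
    return [w - mean for w in raw]

EVEN = gabor_weights(phase=0.0)             # cosine phase: even-symmetric
ODD = gabor_weights(phase=-math.pi / 2.0)   # sine phase: odd-symmetric

def response(weights, profile, half_width=10):
    """Weighted sum of a luminance profile centered on the receptive field."""
    xs = range(-half_width, half_width + 1)
    return sum(w * profile(x) for w, x in zip(weights, xs))

def step_edge(x):
    """Dark-to-light step centered on the receptive field."""
    return 0.0 if x < 0 else 0.5 if x == 0 else 1.0

def bar(x):
    """Bright bar three samples wide, centered on the receptive field."""
    return 1.0 if abs(x) <= 1 else 0.0

for name, profile in [("edge", step_edge), ("bar", bar)]:
    print(name, round(response(EVEN, profile), 3),
          round(response(ODD, profile), 3))
```

With the feature centered on the field, the even-symmetric profile responds to the bar and is silent for
the edge, while the odd-symmetric profile does the reverse. The separation is exact here only because the
toy features are perfectly centered and symmetric.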

As contour curvature is further increased relative to the overall scale of the luminance change, the
prototypical description is a closed blob, a convex region that typically corresponds to a local maximum or
minimum in the luminance field. As the diameter of the feature is further reduced, the blob becomes
point-like. Unlike the apparent correspondence between elongated receptive fields for processing bars and edges, it
is not clear what the neural counterparts are for localizing and distinguishing blob-like luminance changes.
There are many non-orientation selective cells found in cortex, such as those in the so-called anatomical
blob regions, recently revealed by cytochrome oxidase (CO) labelling in V1 (Wong-Riley 1978; Horton &
Hubel 1981; Livingstone & Hubel 1984; Hendrickson 1985; Hubel & Livingstone 1987). CO resides in
blob-like regions that are found in regular arrays throughout V1 and is absent from intervening (interblob)
regions. The blobs lie in the center of ocular dominance columns, but unlike columns, are seen only in
laminae II-III and V-VI. Current speculation is that these structures are involved in color but not form
processing (Livingstone & Hubel 1987, Ts'o et al. 1986a).

Figure 1. Two locations were chosen to demonstrate the complex cross-section profiles of the luminance
signal in a natural image. The blue line with the white dot indicates the location of the cross-section.
These locations are used in figure 2 to illustrate the corresponding convolution values after DOG filtering.


As  Koenderink  (1984)  observes,  it  is  mathematically  possible  to  describe  an  image  purely  in  terms 
of  blobs  of  varying  scales  and  shapes  (see  also  Koenderink  &  van  Doom  1987).  This  leads  to  an  elegant, 
unified  treatment  for  describing  images,  but  we  are  pursuing  the  conjecture  that  the  visual  system 
specifically  distinguishes  within  this  general  scheme  a  small  set  of  scale-dependent  image  features.  Features 
such  as  edges  and  bars  would  have  associated  attributes  or  properties  such  as  location,  blur,  size.  In 
culminating  a  progression  of  studies  on  contour  curvature,  blur,  and  spatial  primitives.  Watt  and  Morgan 
(1985)  presented  a  theory  for  distinguishing  edge  versus  bar  luminance  waveforms  that  is  more  predictive  of 
human  performance  than  earlier  models  based  on  zero-crossings  (Marr  &  Hildreth  1980).  Watt  and 
Morgan's  model,  however,  is  difficult  to  implement  for  waveforms  that  have  substantial  sustained  activity 
(Stevens,  in  preparation).  Their  model  incorporates  an  internal  estimation  of  noise,  which,  when  subtracted 
from  the  waveform,  is  expected  to  produce  regions  of  zero  activity.  The  parsing  of  the  waveform  into  edges 
versus  bars  is  then  based  on  the  arrangements  of  regions  of  activity  bounded  by  regions  of  inactivity.  Our 
experience  with  natural  images,  however,  revealed  that  regions  of  nonzero  response  are  quite  common. 
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There  is  apparently  no  local  means  to  estimate  this  local  "noise”  (the  nonzero  signal  is  not  entirely  noise, 
but  also  a  consequence  of  nonlinear  shading  gradients  which  introduce  nonzero  second  partial  derivatives 
over  extended  regions).  Our  early  implementations  of  the  Watt  and  Morgan  parsing  strategy  had  only 
limited success, because of the strict requirement for zero-bounded regions. We further recognized that the
biological  signals  providing  the  input  to  such  parsing  can  be  expected  to  have  a  substantial,  and 
unpredictable,  sustained  component  that  is  roughly  related  to  absolute  luminance  (Barlow  &  Levick  1969; 
Marrocco  1972;  Stone  &  Fukuda  1974).  The  introduction  of  sustained  response  into  both  the  on-  and  off- 
center  channels  poses  a  major  difficulty  to  strategies  that  are  expecting  to  bound  regions  of  activity. 

Stevens (in preparation) derived a new model for the component activity in the receptive fields that
effectively distinguish edge from bar, which does not require returns to zero activity. Specifically, the
strategy is to localize a region of relatively enhanced activity in the on-cells that is coincident with a
relatively depressed region of activity in off-cells. For a zero-balanced signal only the on- or the off- system
would  be  expected  to  be  active.  An  ideal  edge  would  correspond  to  a  region  of  on-cell  activity  spatially 
adjacent  to  a  region  of  off-cell  activity.  The  regions  of  activity  would  lie  astride  the  zero-crossing  that 
marks  the  location  of  the  ideal  edge.  See  e.g.  (Glunder  1986)  for  a  suggested  neural  implementation  of 
Watt and Morgan's (1985) theory. But with sustained activity one can expect both the on and off systems
to have substantial activity. Hence we suggest measuring the relative proportion of activity across the two
systems at the same spatial location within the RF. That is, while the cross-section of an odd-symmetric
(edge) RF is traditionally modelled as adjacent on-center and off-center subfields, our extension would
expect antagonistic input (inhibitory input of opposite polarity) to each subfield. As a first Boolean
approximation (to an implementation that would eventually use algebraic summation and veto nonlinearity),
an  edge  would  be  marked  by  a  region  of  not  merely  on-activity,  but  ON-and-not-OFF  activity  (meaning 
much  more  on-cell  activity  than  off)  spatially  adjacent  to  a  region  of  opposite  polarity,  namely  OFF-and- 
not-ON.  This  constitutes  a  superposition  of  mutual  antagonism  of  the  opposite  sign  channels  to  each 
subfield  of  the  overall  RF.  As  will  be  developed  further  in  later  sections,  this  corresponds  well  to  the 
observed  behavior  in  simple  and  complex  cells  found  by  Hammond  and  MacKay  (1981,  1983a,  1983b, 
1985). 
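
The Boolean approximation described above can be sketched in one dimension as follows. This is only an illustrative sketch, not the implementation used in the study: the function names, the `ratio` criterion for "much more on-cell activity than off", and the toy response values are all assumptions introduced here.

```python
def label_polarity(on, off, ratio=2.0):
    """Label each position +1 (ON-and-not-OFF), -1 (OFF-and-not-ON) or 0.

    `ratio` is a hypothetical criterion for "much more activity in one
    channel than in the other"; the report does not give a value.
    """
    labels = []
    for a, b in zip(on, off):
        if a > ratio * b:
            labels.append(1)
        elif b > ratio * a:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


def mark_edges(labels):
    """Return indices i where opposite-polarity regions meet between i and i+1."""
    return [i for i in range(len(labels) - 1) if labels[i] * labels[i + 1] == -1]


# Toy bandpass response across an edge, with a sustained component added
# to BOTH channels so that neither channel ever returns to zero activity.
response = [0.0, 0.0, 0.5, 1.0, -1.0, -0.5, 0.0, 0.0]
sustained = 0.4
on = [max(r, 0.0) + sustained for r in response]
off = [max(-r, 0.0) + sustained for r in response]

labels = label_polarity(on, off)
edges = mark_edges(labels)
```

Because the comparison is relative, the edge is still marked between samples 3 and 4 even though both channels carry the sustained component everywhere.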


Receptive  fields  incorporating  spatial  adjunctions  of  this  type  were  implemented  in  a  fast 
algorithm  using  our  AFOSR-funded  digital  convolver  (32-pixel  diameter  kernel)  on  the  Symbolics  3675 
Lisp  Machine.  Using  natural  images,  we  examined  the  method’s  success  in  marking  line  and  edge  features 
(see figures 1 through 3). In this algorithmic study, various curious observations were made. First, the
scheme  is  remarkably  resilient  over  spatial  imbalance  between  the  on  and  off  system  (as  would  arise  in  P- 
cells).  Second,  it  is  sufficient  to  examine  only  the  sign  of  the  convolution  values  (using  a  "trinarizing" 
method  that  differentiates  positive,  negative,  and  deadband  zones  of  activity,  where  the  deadband  is  typically 
5-10%, i.e. on the order of the sensitivity expected of P-cells). Third, it is sufficient to use a small spatial
zone  of  coincidence  for  localizing  edges  and  bars;  the  precise  localization  of  central  moments  (as  Watt  and 
Morgan,  1985,  propose)  is  not  particularly  essential  for  the  localization  task.  Fourth,  we  found  that  spatial 
localization  in  two  dimensions  can  be  effectively  finessed  by  INCLUSIVE-ORing  the  detection  of  an  edge 
or  bar  in  four  or  so  independent  orientations.  That  is,  we  find  that  excellent  bar  and  line  positional  marking 
can be achieved by non-oriented receptive fields.
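
The "trinarizing" step and the inclusive OR over orientations can be illustrated with a short sketch. It assumes, purely for illustration, that the deadband is expressed as a fraction of the peak absolute convolution value; the report gives the deadband only as 5-10%, and the function names and sample values are hypothetical.

```python
def trinarize(values, deadband):
    """Map each convolution value to +1, -1, or 0 when it lies in the deadband."""
    return [0 if abs(v) <= deadband else (1 if v > 0 else -1) for v in values]


def inclusive_or(detections_per_orientation):
    """Combine Boolean detection maps (one list per orientation), position by position."""
    return [any(flags) for flags in zip(*detections_per_orientation)]


# Hypothetical convolution values; the deadband is taken as 5% of the peak.
values = [0.02, 0.5, -0.04, -0.8, 0.07, 1.0]
deadband = 0.05 * max(abs(v) for v in values)
signs = trinarize(values, deadband)

# Four hypothetical orientation channels, each reporting detections at 3 positions.
combined = inclusive_or([
    [False, True, False],
    [False, False, False],
    [True, False, False],
    [False, False, False],
])
```

A position is marked in `combined` if any one of the orientation channels detects a feature there, which is why the result carries no orientation label.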


Figure  2.  The  result  of  convolving  the  image  in  figure  1  with  a  DOG  having  ratio  of  excitatory  and 
inhibitory  space  constants  of  1:5.  Note  that  the  profiles  correspond  to  the  locations  used  in  figure  1.  The 
convolution  profile  represents  several  closely-spaced  edge  and  line  features. 


Another  observation  from  the  earlier  implementation  study  is  the  dramatic  reduction  in  complexity 
of  potential  forms  that  results  from  making  an  early  decomposition  of  the  image  into  separate  bar  and  edge 
descriptions.  Compared  e.g.  to  the  plethora  of  undifferentiated  zero-crossing  contours  resulting  from 
convolution with a small DOG operator (few of which correspond to form boundaries), the edge segments
seen  in  the  absence  of  the  lines  and  bars  make  for  a  much  more  readily  interpreted  collection  of  forms. 

We  find  it  significant  that  while  edge  detection  is  generally  expected  to  be  accomplished  by 
elongated  receptive  fields,  which  incorporate  orientation  selectivity,  it  is  possible  in  principle  to  dissociate 
the  problem  of  localizing  an  edge  (or  bar)  from  the  problem  of  computing  its  orientation,  blur,  contrast,  or 
other attributes. We do not expect that the biological system follows this strict functional division, i.e.
using separate mechanisms for edge localization and for measuring orientation, and so forth. Elongated
RFs are likely used for both measuring curvature and for spatial localization, as modeled recently (Wilson
1986;  Dobbins,  Zucker  &  Cynader  1987).  But  it  is  nonetheless  noteworthy  that  two-dimensional  detection 
of  edge  or  bar  location  can  be  finessed  with  radially-symmetric  operators.  The  neurophysiological 
prediction  would  be  the  detection  of  RF's  that  are  sharply  tuned  to  phase  (e.g.  even  versus  odd  symmetry) 
but  independent  of  the  orientation  of  the  bar  or  edge. 


Figure 3. Demonstration of our line/edge parser, where edges and lines (thin bars) are emphasized. This
output was generated using one free parameter, namely a threshold representing contrast sensitivity of 5%.
The implementation uses a digital convolver for the DOG convolution.

2.2 The Café Wall as Evidence for Parsing

The Café Wall is composed of alternating black and white tiles separated by narrow horizontal mortar lines
of  intermediate  grey  (figure  4).  Two  effects  are  associated  with  the  pattern.  The  horizontal  edges  of  the 
individual  tiles  appear  slightly  tilted,  causing  them  to  appear  wedge-shaped,  and  overall,  one  receives  the 
impression  that  the  mortar  lines  are  not  parallel. 

The Café Wall effect appears to involve several early aspects of edge detection: irradiance,
brightness induction, orientation detection, edge localization, and contour completion. These effects derive
from phenomena associated with the mortar, both within the mortar and at the borders between the mortar
and the tiles above and below. The horizontal white and black edges that border on the mortar line appear to
intrude diagonally into the mortar, producing a succession of wedge-shaped tiles. And within the mortar
itself, rather than a uniform grey one sees alternating light and dark diagonal bands, a so-called "twisted
cord" (Fraser 1908). The twisted cord has been shown to induce an illusion of overall tilt in this and other
patterns (Fraser 1908), presumably from interactions among orientation-tuned cortical units that construct
extended continuous contours (Moulden & Renshaw 1979; Grossberg & Mingolla 1985).


Figure 4. The Café Wall pattern.

Gregory and Heard (1979) propose that the intrusion of the tiles into the mortar is a consequence of
the failure of a hypothesized "border locking" process, whose purpose is to constrain the spread of
lightness to within regions bounded by contrast edges. They suggest that in those regions where tiles of
opposite contrast overlap vertically, the border-locking process does not localize the edge boundaries
correctly; lightness then migrates across the mortar to form the illusion. Gregory and Heard (1979) further
suggest a relationship between edge contour shifts and irradiance, which Moulden and Renshaw (1979) also
show strongly contributes to the tilt distortion in the related Münsterberg illusion. McCourt (1983) has
shown that brightness induction also contributes to the effect by inducing an alternating pattern of elongated
diagonal light and dark strands within the mortar.

Morgan and Moulden (1986) examined the spatial frequency content of twisted cords, by
convolving the Café Wall pattern with a Laplacian operator that resembles the center-surround operator
found in the retina (Rodieck & Stone 1965; Wilson & Bergen 1979; Marr & Hildreth 1980). They show
that Laplacian filtering retains, and even accentuates, the apparent twisted cords in the Café Wall and
Münsterberg patterns. The extrema (ridges and troughs) in the Laplacian-convolved image correspond quite
directly to the light and dark strands of the apparent twisted cord. Foley and McCourt (1985) show that
such center-surround operators can induce opposite-phase brightness into narrow fields, as is the case for a
difference of Gaussians (DOG) operator of diameter somewhat larger than the mortar width. In addition to
attributing small-scale brightness induction effects to retinal DOGs, irradiance-type shifts in apparent edge
location along the mortar (Moulden & Renshaw 1979; Gregory & Heard 1979) may also have a partly
retinal origin, e.g. by a compressive transform at luminance transduction (Morgan, Mather, Moulden &
Watt 1984; Mather & Morgan 1986). The tilt effect in the Café Wall seems therefore to originate in part as
perturbations or artifacts induced by the spatial filtering performed at the retina, and in part from cortical
processes that measure the orientation and position of these perturbed luminance changes.

In  addition  to  the  local  origins  of  tilt  in  the  pattern,  there  is  need  to  explain  the  overall  impression 
of  convergence  along  the  alternating  rows  of  tiles.  As  Fraser  (1908)  originally  conjectured,  the  global 
aspects  of  the  illusion  likely  emerge  from  integrative  interactions  along  the  length  of  the  mortar.  Support 
has been given for this conjecture, phrased in terms of facilitatory interactions among orientation-tuned
cortical units that construct extended continuous contours (Moulden & Renshaw 1979; Grossberg &
Mingolla  1985). 

We examine here quantitatively the extent to which fine-scale Laplacian-like (DOG) filtering at the
retina induces the topographic features associated with the twisted cords. We find that the smallest proposed
DOG operator makes satisfactory predictions regarding the presence and extinction of induction-like features
in the bandpass image, providing further support for the ideas described by Morgan and Moulden (1986).
We report a reversal of the traditional Café Wall effect that is dependent upon mortar width.

We also examine the involvement of orientation-selective units (e.g. simple cells) in extraction of
the local tilt. Hammond and MacKay (1981, 1983a, 1983b, 1985) demonstrated a dramatically non-linear
property of striate cells: suppression when a bar is augmented with a small dot of opposite contrast. The
cell appears to be gated-off by the addition of the small dot of opposite contrast. We have found that by
inserting points of opposite contrast at strategic locations where even-symmetric receptive fields might be
expected to align locally with the ridges or troughs in the Café Wall pattern, there is a corresponding
disruption of the tilt illusion.

The  Influence  of  Circular-Symmetric  Operators 

Morgan  and  Moulden  (1986)  observe  that  the  light  and  dark  bands  within  the  twisted  cord  correspond  to  the 
ridges  and  troughs  (local  extrema)  in  the  Laplacian-convolved  image.  Using  a  zero-balanced  difference  of 
Gaussians,  with  ratio  of  excitatory  to  inhibitory  space  constants  of  1:5  (or  larger),  we  examined  the 
behavior  of  the  extrema  in  the  convolution  values  as  a  function  of  the  relative  size  of  the  operator  and  the 
mortar  width. 
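
As a rough illustration of the operator just described, the following one-dimensional sketch builds a zero-balanced difference of Gaussians with a 1:5 ratio of space constants and applies it to a luminance step. It is a simplification under stated assumptions: the study used two-dimensional operators on a hardware convolver, and the function names, the pixel-unit space constant, the kernel radius, and the edge-replication boundary rule are all choices made here for the example.

```python
import math


def gaussian_1d(sigma, radius):
    """Sampled Gaussian normalized to unit sum."""
    g = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(g)
    return [v / total for v in g]


def zero_balanced_dog(sigma_center, ratio=5.0, radius=None):
    """Center Gaussian minus surround Gaussian; each has unit sum, so the kernel sums to zero."""
    sigma_surround = ratio * sigma_center
    if radius is None:
        radius = int(math.ceil(4.0 * sigma_surround))
    center = gaussian_1d(sigma_center, radius)
    surround = gaussian_1d(sigma_surround, radius)
    return [c - s for c, s in zip(center, surround)]


def convolve_same(signal, kernel):
    """Same-length filtering with edge replication (the kernel is symmetric,
    so correlation and convolution give the same result)."""
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - r, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


kernel = zero_balanced_dog(sigma_center=2.0)      # space constants 2 and 10 pixels
step = [0.0] * 50 + [1.0] * 50                    # dark-to-light luminance step
step_response = convolve_same(step, kernel)
flat_response = convolve_same([3.0] * 100, kernel)
```

Zero balance means a uniform field produces no response, while the step produces a negative lobe on its dark side and a positive lobe on its light side, with the zero-crossing between samples 49 and 50.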

Gregory and Heard (1979) varied mortar width and found that the illusory tilt was maximal at the
smallest width they tested (1'), and that it weakened with increasing mortar width, until little distortion was
observed for a 10' width^. We find that this behavior is likewise reflected in the amplitude and shape of the
extrema in the DOG-convolved image as mortar width is increased. We assume that the smallest DOG
operator in the central fovea has a central excitatory diameter ω of about 1.3' (Marr, Poggio & Hildreth
1980; Richter & Ullman 1982)^. The quantitative behavior of a DOG operator of this size as a function
of mortar width suggests to us that this scale of operator governs both the appearance, and the gradual
disappearance, of the local tilt in the Café Wall patterns seen in sharp focus. When the mortar width is on
the order of ω, the extrema in the resulting convolution values resemble twisted cords, as Morgan and
Moulden (1986) report. The convolution values within the mortar are also modulated periodically, in
accordance with the brightness induction observed by McCourt (1983). Specifically, while the mortar is a

^  This  result  was  obtained  for  patterns  presented  in  sharp  focus;  the  illusion  can  also  be  seen  over  a  larger 
range  of  scales  when  blurred  optically  or  viewed  somewhat  peripherally  (Moulden  &  Renshaw  1979;  Gregory 
&  Heard  1979). 

^  Marr  et  al.  (1980)  include  the  optical  point  spread  function  in  the  effective  diameter  of  this  operator.  This 
proposed  operator  size  would  represent  the  smallest  expected  diameter  of  DOG  operator,  as  based  on  the 
density  and  diameter  of  cones  in  the  central  fovea,  and  presumes  that  the  excitatory  input  derives  from  a 
single  cone  (Richter  &  Ullman  1982). 




uniform neutral grey, the contrast within the convolved mortar alternates in reverse phase relative to the
contrast of the bordering tiles: where bounded above and below by white tiles the convolved mortar is
darker, and vice versa (see figure 4a). For mortar of width ω or slightly larger, the convolved values within
the mortar show a gradient tilted from the horizontal, which, as noted, corresponds well to the apparent
twisted cord. But as mortar width further increases, the DOG receptive fields with centers that fall within
the mortar receive decreasing contribution to their surrounds from the tiles above and below the mortar. The
brightness induction effect thus diminishes until, for sufficient mortar width, the tile/mortar margins are
essentially isolated edges. The progressive diminution of the induced twisted cords is predicted rather well
by assuming the DOG operator has an ω of about 1.3'. An operator much larger would have preserved the
twisted cord beyond the limits observed by Gregory and Heard (1979). This is further quantified by
Experiment 1.


Figure 5. Two example stimuli from experiment 1, to be viewed at normal reading distance (i.e.
approximately 50 cm) with central fixation. In a the wedge distortion in the central row appears to narrow
to the left. In b the mortar is wider and there is a slight impression of distortion in the opposite direction.

In the following (Lulich & Stevens, submitted) we will describe a brightness-induction effect in
the Café Wall pattern that arises with large mortar widths. One can contrive to have the bands of induced
brightness reverse the apparent direction of tilt in the Café Wall. The reversal effect can be seen in figure
5b. Notice that in the small-width pattern (figure 5a) the mortar lines appear to deviate from the horizontal,
as expected in the traditional Café Wall effect. The two lines do not appear parallel, but rather define a
shallow wedge that narrows toward the left. In figure 5b, where the mortar width is increased, there is a
slight impression of a wedge pointing in the opposite direction, that is, with the right narrower than the
left. We examined this change in apparent orientation as a function of mortar width.

As  in  Morgan  and  Moulden's  (1986)  study,  this  reversal  effect  can  be  shown  to  be  objectively 
present  in  the  bandpass-filtered  image  (see  figure  7).  It  thus  lends  further  support  to  the  notion  that  the 
local tilt illusion in the Café Wall is introduced largely by artifacts of retinal processing. But furthermore,
close  examination  of  the  topography  of  the  bandpass-filtered  image  raises  questions  concerning  the 
measurement  of  orientation,  as  distinct  from  spatial  localization,  of  luminance  changes. 

Experiment  1 

Method 

Subjects:  Three  subjects  took  part  in  the  experiment;  all  had  normal  or  corrected  to  normal  visual  acuity. 
These  subjects  had  participated  in  a  variety  of  experiments  in  our  laboratory  and  were  familiar  with  the 
techniques  and  methods  required.  All  were  naive  to  the  purposes  of  this  experiment. 

Stimuli: Café Wall patterns of varying mortar width were generated using a Symbolics 3600 Lisp Machine
and displayed on a high-resolution CRT (Tektronix 634, with P45 phosphor and 0.21 mm spot size). The
patterns were viewed from a distance of 203 cm using natural pupils. The patterns were constructed with
varying combinations of tile size and mortar width (see representative stimuli in figure 5). Two tile sizes
were chosen, 7.1' and 14.2'. For the stimuli based on the smaller tile size the pattern consisted of three
rows of six columns of alternating black and white tiles. The stimuli made up of the larger tile size
consisted of three rows of four columns. The patterns were embedded in a background grey equal to the grey
of the mortar. This grey was balanced to one-half the luminance difference between the black and white
tiles. The luminance of the white tiles was 42 ft/L, that of the black tiles 0.1 ft/L, and that of the mortar and
background 22 ft/L. For each size of tile, stimuli were presented at one of ten different mortar widths.
Mortar width was varied from 0.35' to 3.5' in increments of 0.35'.
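
The display geometry quoted above can be checked with the standard visual-angle formula. The sketch below is illustrative only; the function name is hypothetical, and treating one 0.21 mm display spot as the unit of width is an assumption, since the report does not state how mortar width was quantized.

```python
import math


def visual_angle_arcmin(size_mm, distance_mm):
    """Visual angle, in minutes of arc, subtended by an extent viewed head-on."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm))) * 60.0


# One 0.21 mm display spot viewed from 203 cm.
spot_arcmin = visual_angle_arcmin(0.21, 2030.0)
```

Under these figures one spot subtends a little over 0.35', which is close to the stated 0.35' increment in mortar width.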
The overall height of each pattern was dependent upon the size of the tiles and the particular mortar
width. The overall shape of the patterns was made rectangular by cropping the left and right vertical
borders. Cropping primarily ensured that the patterns did not extend beyond the central foveal region (the
overall patterns subtended about 28' by 43'). It served to remove a distracting effect wherein the central row
would otherwise protrude past the top and bottom rows by a half-tile, forming an overall arrow-like
configuration pointing to the left or right.

Procedure: Using a forced-choice design, subjects viewed Café Wall stimuli presented for 200 msec.
Subjects were told to fixate and attend to the center of the pattern. It was their task to decide the direction of
apparent convergence in the pattern, i.e. whether the vertical separation between the two mortar lines
appeared to narrow towards the left versus right side of the pattern. They indicated their choice by pressing a
left or right labeled button; no feedback was given regarding their responses. Each pattern was presented
either in its original form or as its mirror reflection. Overall, 120 displays were presented in random
order, corresponding to 3 repetitions of combinations of the 10 mortar widths, 2 sizes of tile, and 2
conditions of mirror reflection.
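
The count of 120 displays follows from fully crossing the three factors and repeating each combination three times. A minimal sketch of such a trial list is given below; the names and the seeded shuffle are assumptions for illustration, and the report does not describe how the random order was actually generated.

```python
import itertools
import random

MORTAR_WIDTHS = [round(0.35 * k, 2) for k in range(1, 11)]  # 0.35' to 3.5'
TILE_SIZES = [7.1, 14.2]                                     # arcmin
MIRRORED = [False, True]
REPETITIONS = 3


def build_trials(seed=0):
    """Fully crossed design, repeated and shuffled (hypothetical helper)."""
    conditions = list(itertools.product(MORTAR_WIDTHS, TILE_SIZES, MIRRORED))
    trials = conditions * REPETITIONS
    random.Random(seed).shuffle(trials)
    return trials


trials = build_trials()
```

Here 10 widths x 2 tile sizes x 2 reflection conditions give 40 distinct displays, and three repetitions give the 120 presentations.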

Results  and  Discussion 

The impression of overall tilt in the Café Wall pattern reversed orientation for all subjects at approximately
the same mortar width. This result is plotted for both tile sizes in figure 6. Both curves are sigmoidal with
a zero-crossing in the vicinity of 1.75' mortar width. There was no significant effect of tile size on the
reversal effect. In each case larger standard deviations were found for the larger mortar widths, consistent
with the subjects' reported impression that the reversal effect was weaker than the original Café Wall
illusion.

Numeric convolution of these patterns, for correspondingly scaled mortar widths and DOG sizes,
reflects these results. In figure 4 the pattern dimensions in the experimental stimuli are scaled so that the
DOG (corresponding to the smallest physiologically-predicted DOG of 1.3') had a central excitatory diameter
of 11 pixels. The 7.1' tiles of the pattern correspond to 60 pixels in the scaled pattern. The DOG operator
was zero-balanced (the amount of inhibition equaled the amount of excitation), and had a ratio of center to
surround space constants of 1:5. Figure 7a corresponds to the convolution of a Café Wall pattern with
narrow mortar, which induces the traditional effect; figure 7b shows the convolution for a larger mortar
width, where the reversal is observed.
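
The scaling between visual angle and pixels in the simulated patterns follows directly from the stated correspondence of a 7.1' tile to 60 pixels. The short sketch below makes the arithmetic explicit; the constant and function names are introduced here only for illustration.

```python
TILE_ARCMIN = 7.1
TILE_PIXELS = 60
PIXELS_PER_ARCMIN = TILE_PIXELS / TILE_ARCMIN   # about 8.45 pixels per arcmin


def arcmin_to_pixels(arcmin):
    return arcmin * PIXELS_PER_ARCMIN


def pixels_to_arcmin(pixels):
    return pixels / PIXELS_PER_ARCMIN


dog_diameter_pixels = arcmin_to_pixels(1.3)     # about 10.99, i.e. 11 pixels
narrow_mortar_arcmin = pixels_to_arcmin(9)      # about 1.07'
wide_mortar_arcmin = pixels_to_arcmin(18)       # about 2.13'
```

At this scale a 1.3' central excitatory diameter rounds to 11 pixels, and mortar widths of 9 and 18 pixels correspond to roughly 1' and 2' respectively.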

It is noteworthy that the induction effects in these stimuli are so strongly scale dependent. Since
the data were gathered for mortar widths near the size of the smallest predicted physiological DOG operator,
the observed transition point for the reversal (1.75') corroborates the prediction that the smallest operator
has a central excitatory diameter of about 1.3' (Marr et al. 1980). Furthermore, the observed brightness
induction-like effect is consistent with Morgan and Moulden's (1986) conclusion that it is a consequence of
bandpass filtering.


Figure 6. Graphs of tilt direction judgements, for small tiles in a and large tiles in b. In each case subjects
experienced a reversal in the direction of apparent convergence for mortar widths of approximately 1.75'.


With reference to figure 7a, a mortar width of 9 pixels (equivalent to 1.0') results in a convolution
where the dark and light bands of Fraser's twisted cords are apparent in the mortar, in accordance with
Morgan and Moulden's (1986) observation. The light ridges are tilted slightly from the horizontal and span
diagonally across the mortar to connect the corners of the white tiles above and below the mortar.
Likewise, the dark ridges in the mortar span diagonally between corners of the black tiles. For a wider
mortar of 18 pixels (equivalent to 2.0'), and convolution with the same diameter DOG, the traditional
twisted cord arrangement of bands is extinguished, and replaced by a fainter induction effect. Light bands
now appear to span the mortar between the black tiles (and likewise dark bands between the white tiles).
These induction bands or bars are seen at a steeper angle to the horizontal than those induced at smaller
mortar widths. As mortar width is further increased the induction effect spanning the mortar is further
diminished and becomes negligible for mortar widths about twice the size of the DOG operator.

Figure 7. DOG convolution of a Café Wall pattern where the mortar is narrower (in a) and wider (in b) than
the DOG central excitatory diameter. Numeric values in c and d correspond to regions indicated by
rectangles in a and b.


The  Influence  of  Elongated  Receptive  Fields 

A  DOG  operator  of  appropriate  scale  produces  a  distribution  of  convolution  values  that  is  distorted  in  a 
manner  that  resembles  the  apparent  twisted  cords.  Specifically,  the  geometry  of  the  convolution  values, 
particularly the induction of diagonal gradients into the mortar, corresponds well with the impressions of
illusory tilt. But how is the geometry detected? The ridges and troughs Morgan and Moulden (1986)
discuss are initially implicit in the DOG-convolved image. While there is rather broad consensus on how
to model the circular-symmetric operators at the retina, there is no correspondingly quantitative model
available of the measurement and encoding of tilt. It is generally expected, but still conjectural, that
simple cells perform the encoding, for example.

There  are  many  distinct  ways  to  compute  the  orientation  of  a  ridge  or  trough  within  a  two- 
dimensional  field  or  array  of  values.  One  approach  is  to  locate  the  points  of  instantaneous  local  maximum 
or minimum according to some spatial criterion and to subsequently fit a line or curve through these points.
Watt and Morgan (1985) propose measuring the central moments. However, their proposal is restricted to a
one-dimensional signal; the extension to an arbitrary two-dimensional signal is not straightforward. But
where the luminance signal is approximately one-dimensional, such as midway along a ridge-like strand of
the twisted cord, the luminance signal could be approximated as a bar. The orientation of the ridge or trough
would  be  the  local  tangent  to  the  locus  of  local  maximum  or  minimum  activity  (i.e.  zero-bounded  regions 
of  activity).  Another  computationally  distinct  method  is  to  summate  the  convolution  values  within 
elongated  (orientation-selective)  receptive  fields  of  varying  orientation,  and  to  select  that  receptive  field 
having  maximum  response  (e.g.  Tyler  &  Nakayama  1984).  Clearly  this  latter  computation  would  benefit 
from  using  antagonistic  even-symmetric  subfields,  so  that  the  operator  would  produce  zero  net  response  in  a 
constant  field.  The  relationship  between  this  computational  method  and  cortical  receptive  fields  is  rather 
apparent. The former method, based on a local extremum (or centroid) operator, is less easily related to
neurophysiology. 
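
The second method, selecting the elongated receptive field of maximum response, can be sketched as follows. This is a toy illustration under several assumptions and is not the model of any cited study: the convolved image is stood in for by a hypothetical continuous function `ridge(x, y)` rather than a pixel array, the operator is a center line flanked by two antagonistic parallel lines, and the field length, flank offset, and candidate orientations are arbitrary.

```python
import math


def elongated_response(field, cx, cy, theta_deg, half_length=10, flank_offset=4.0):
    """Sum along a center line minus the mean of the sums along two parallel flanks.

    The antagonistic flanks make the net response zero in a constant field.
    """
    th = math.radians(theta_deg)
    ux, uy = math.cos(th), math.sin(th)   # unit vector along the field axis
    nx, ny = -uy, ux                      # unit vector perpendicular to the axis
    center = flank_a = flank_b = 0.0
    for t in range(-half_length, half_length + 1):
        x, y = cx + t * ux, cy + t * uy
        center += field(x, y)
        flank_a += field(x + flank_offset * nx, y + flank_offset * ny)
        flank_b += field(x - flank_offset * nx, y - flank_offset * ny)
    return center - 0.5 * (flank_a + flank_b)


def best_orientation(field, cx, cy, candidates):
    """Orientation (degrees) of the candidate field with the largest response."""
    return max(candidates, key=lambda th: elongated_response(field, cx, cy, th))


def make_ridge(theta_deg, width=1.5):
    """Hypothetical ridge with a Gaussian cross-section, passing through the origin."""
    th = math.radians(theta_deg)

    def ridge(x, y):
        d = -x * math.sin(th) + y * math.cos(th)   # signed distance from the crest
        return math.exp(-d * d / (2.0 * width * width))

    return ridge


ridge = make_ridge(20.0)
estimate = best_orientation(ridge, 0.0, 0.0, range(0, 180, 10))
```

For a ridge tilted 20 degrees from the horizontal, the candidate aligned with the crest gives the largest response, and the same operator gives no net response in a constant field.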

Since  simple  and  complex  cells  of  the  striate  cortex  are  the  first  cells  in  the  visual  pathway  to 
demonstrate orientation selectivity (Hubel & Wiesel 1962), it is reasonable to suspect that they are involved
in the determination of the orientation of Fraser's twisted cords and the mortar line (Morgan & Moulden
1986). The following experiment further supports the suggestion that striate cells underlie the orientation
measurement.


Experiment  2 

Hammond  and  MacKay  (1981,  1983a)  have  shown  that  length  summation  in  simple  cells  is  highly 
dependent  on  contrast  along  the  axis  of  the  receptive  field.  For  stimuli  having  constant  contrast  along  their 
lengths,  length  summation  was  found  to  be  substantially  linear.  Additions  of  line  gaps  of  reduced 
(background)  contrast  lowered  the  response  of  the  cell  by  an  amount  predictable  by  length  summation 
considerations.  However,  reduction  of  contrast  below  background  (reversed  contrast)  produced  an 
unpredictably large response decrement. The term "gating" inhibition was given to this phenomenon to
distinguish it from simple removal of excitatory drive, and it appears to be a property of complex cells as well
(Hammond & MacKay 1983b, 1985). Brookes and Stevens (1988) have used this effect to examine the role
of striate cells in gating the signalling of orientation in groups of dots. We have modified the Café Wall
patterns by adding tiny dots of opposite contrast to Fraser's twisted cords. The prediction is that these dots
would gate-off the units that would otherwise assert the orientation of the cords in the unmodified pattern.

Method 

Subjects:  Forty  naive  subjects  participated  in  this  experiment.  All  subjects  had  normal  or  corrected  to 
normal  visual  acuity. 

Stimuli:  Two  different  displays  (stimuli  1  and  2)  were  constructed  using  the  same  apparatus  used  in  the 
previous  experiment.  For  both  stimuli  the  background  was  a  neutral  grey  that  matched  the  luminance  of 
the mortar line (8.2 ft/L). The individual Café Wall patterns in both types of display were composed of
seven columns by six rows of alternating black and white tiles, each 10.3' on a side. Each Café Wall
pattern subtended 72.1' horizontally by 61.8' vertically. The width of the mortar in all patterns was 1'.
The  luminance  of  the  white  tiles  was  15.2  ft/L  and  the  black  tiles  was  4.1  ft/L.  The  Michelson  contrast 
across  the  tiles  was  57.5%. 
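
The contrast figure quoted for the tiles can be checked directly from the two tile luminances. The following is a minimal sketch, assuming the usual definition of Michelson contrast, (Lmax - Lmin) / (Lmax + Lmin); the function name is ours and is introduced only for illustration.

```python
def michelson_contrast(lum_a: float, lum_b: float) -> float:
    """Michelson contrast of two luminances: |a - b| / (a + b)."""
    if lum_a + lum_b <= 0:
        raise ValueError("luminances must sum to a positive value")
    return abs(lum_a - lum_b) / (lum_a + lum_b)


# Tile luminances as quoted in the text: white 15.2, black 4.1.
tile_contrast = michelson_contrast(15.2, 4.1)
print(f"{100 * tile_contrast:.1f}%")  # prints 57.5%
```

The same function can be applied to any dot and tile pair for which both luminances are known.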

Some Café Wall patterns had white or black dots positioned in the mortar. Each dot was a
rectangle 1' by 1.5' and positioned at the center of the region of overlap between tiles of similar contrast.
Stimulus 1 presented a traditional Café Wall pattern and two variants on the basic pattern induced by the
included  dots,  so  that  subjects  could  rank-order  the  relative  strength  of  the  illusions  as  a  function  of  the 
variation  we  introduced.  Stimulus  2,  shown  in  a  subsequent  series  of  presentations,  provided  a  follow-on  to 
the  basic  result  provided  by  stimulus  1. 

Stimulus 1 was arranged as in figure 8 and contained three Café Wall patterns (1a, 1b and 1c).
Café Wall pattern 1a contained mortar dots of the same contrast sign as the tiles above and below them. The
white dots had a luminance of 22.2 ft/L and the black dots 4.1 ft/L. The Michelson contrast between the
white dots and the white tiles was 18.5%. The Michelson contrast between the black dots and the black
tiles was 11.8%. Café Wall pattern 1b used mortar dots of opposite contrast compared to those in pattern
1a. The luminance of the white and black dots was the same as in pattern 1a. The Michelson contrast
between the white dots and the black tiles was 68.7% and between the black dots and the white tiles was
90%. Café Wall pattern 1c contained no dots.

Figure 8. Stimulus 1 of experiment 2.

Figure 9. Stimulus 2 of experiment 2.

Stimulus 2, as shown in figure 9, was composed of two Café Wall patterns (2a and 2b). Café Wall
pattern 2a was the conventional pattern and contained no dots, while pattern 2b had dots in the opposite
contrast configuration. The luminance of the black and white dots equalled that of the black and white tiles.
The Michelson contrast between the white dots and the black tiles was 51.5% and between the black dots
and the white tiles was 61.7%.

Procedure: Twenty subjects viewed stimulus 1 and twenty different subjects viewed stimulus 2. Each subject
was seated 120 cm from the stimulus display. To introduce the illusion, each subject was first shown the
Zöllner illusion and asked to describe the figure. When the subject mentioned the apparent convergence of
the parallel lines, it was explained that convergence of the mortar lines in the Café stimuli would similarly
be seen, to varying degrees. Next the subject observed the stimulus (1 or 2) and was instructed to rank order
the apparent convergence of the mortar lines in the presented Café Wall patterns. Each subject was given as
much  time  as  necessary,  but  asked  to  perform  the  task  quickly  and  to  rely  primarily  on  initial  impressions. 
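
For readers who wish to reproduce the display geometry, the angular sizes given under Stimuli can be converted to physical sizes at the 120 cm viewing distance. The sketch below assumes the standard visual-angle relation, size = 2 d tan(θ/2); the function and constant names are ours and do not come from the original apparatus.

```python
import math


def arcmin_to_cm(arcmin: float, distance_cm: float) -> float:
    """Physical extent (cm) that subtends `arcmin` minutes of arc at `distance_cm`."""
    theta = math.radians(arcmin / 60.0)
    return 2.0 * distance_cm * math.tan(theta / 2.0)


VIEWING_DISTANCE_CM = 120.0  # viewing distance stated in the Procedure
TILE_ARCMIN = 10.3           # tile side stated under Stimuli

tile_cm = arcmin_to_cm(TILE_ARCMIN, VIEWING_DISTANCE_CM)

# Seven columns by six rows of tiles reproduce the stated pattern extent.
pattern_width_arcmin = 7 * TILE_ARCMIN   # about 72.1'
pattern_height_arcmin = 6 * TILE_ARCMIN  # about 61.8'

print(f"tile side: {tile_cm:.3f} cm")
```

Under these assumptions a tile is roughly 0.36 cm on a side on the display surface.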


Table 1: Stimulus 1
Rank orderings

Case     Most    Mid    Least    Total
A           0      0       20       20
B          16      4        0       20
C           4     16        0       20
Total      20     20       20

Table 2: Stimulus 2
Rank orderings

Case     Most    Least    Total
A           0       20       20
B          20        0       20
Total      20       20

Results and Discussion

Table 1 shows the rank-ordering frequency data for the three different versions of the Café Wall in
stimulus 1. Chi-squared analysis showed that the null hypothesis (i.e. the convergence strength of each
pattern is equal) could be rejected (χ² = 81.5, p < .001, d.f. = 4). A χ² of 81.5 demonstrates a highly
significant effect. Notice that Café Wall pattern (1a), the pattern with dots of like contrast, was always ranked
as having the weakest convergence illusion. Café Wall pattern (1b), the pattern with dots of opposite contrast,
was ranked as the strongest 80% of the time. Table 2 shows the rank-ordered data for stimulus 2. A
strong effect is clear. Chi-squared analysis showed that the null hypothesis could be rejected (χ² = 40, p <
.001, d.f. = 1). Notice that lowering the contrast of the dots resulted in a perfect preference ordering.
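
The two test statistics can be recomputed from the counts in Tables 1 and 2. The sketch below assumes an ordinary Pearson chi-squared test on the rank-by-pattern contingency table, with expected counts taken from the row and column totals; the report does not state exactly how the analysis was run, so this is an illustration and not a description of the original procedure.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows are patterns A, B, C; columns are Most, Mid, Least (Table 1).
table_1 = [[0, 0, 20], [16, 4, 0], [4, 16, 0]]
# Rows are patterns A, B; columns are Most, Least (Table 2).
table_2 = [[0, 20], [20, 0]]

stat_1, dof_1 = chi_squared(table_1)
stat_2, dof_2 = chi_squared(table_2)
print(f"stimulus 1: chi2 = {stat_1:.1f}, d.f. = {dof_1}")
print(f"stimulus 2: chi2 = {stat_2:.1f}, d.f. = {dof_2}")
```

Under this assumption the sketch gives approximately 81.6 with 4 degrees of freedom for stimulus 1, close to the reported 81.5, and 40 with 1 degree of freedom for stimulus 2.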

With  the  addition  of  dots  of  opposite  contrast  to  the  center  of  Fraser's  twisted  cords,  the  overall 
impression  of  tilt  of  the  mortar  lines  was  significantly  weakened.  In  addition,  the  presence  of  a  dot  of  like 
contrast  but  higher  luminance  enhanced  the  impression  of  tilt.  McCourt  (1983)  produced  similar  effects  on 
apparent  convergence  by  manipulating  the  contrast  sign  along  extended  segments  of  the  mortar.  His 
stimuli  were  designed  to  distinguish  between  the  contributions  of  brightness  induction  and  those  of  Gregory 
and Heard's (1979) border locking notion. McCourt found that the Café Wall convergence effect is
weakened  when  the  mortar  is  darkened  in  the  segments  bounded  above  and  below  by  black  tiles  and 
lightened  where  bounded  by  white  tiles  (leaving  neutral  grey  in  the  transition  regions  where  tiles  have 
opposite  contrast).  While  this  reduces  the  potential  for  brightness  induction  in  those  regions,  it  also 
removes  the  bar-like  features  that  provide  (piecewise)  contour  continuity  along  the  mortar.  In  our  stimuli 
the introduction of point-like contrast reversals is likewise highly effective in disrupting the contour
organization along the mortar. The facilitatory effect provided by point-like contrast enhancement is
similarly consistent with McCourt's (1983) findings using extended mortar segments. While global
continuity of contrast along the mortar is not necessary, as Fraser (1908) originally observed, continuity of
contrast within the individual bar-like segments that comprise the braided strands of the twisted cord is
important. That is, to disrupt the overall effect it is sufficient to disrupt the contrast continuity within the
individual white or black bar-like segments (those bounded above and below by tiles of same contrast). As
Hammond and MacKay found, contrast reversal along the length of the bar is much more potent than a
simple gap. The results of experiment 2 thus suggest that mechanisms with a gating non-linearity similar
to  that  observed  in  striate  cells  contribute  to  the  piecewise  measurement  of  tilt  and  underlie  their  integration 
along  continuous  contours. 

2.3 Asserting Orientation between Discrete Items

One  can  readily  see  structure  in  patterns  made  up  of  discrete  points  where  the  structure  comes  from  the 
spatial  relationships  of  the  points  in  the  pattern.  For  example,  figure  10  shows  a  simple  dot  pattern  known 
as  a  Glass  pattern  (Glass  1969).  Glass  patterns  are  constructed  by  superimposing  onto  a  random  dot  pattern 
a  copy  that  has  been  transformed,  e.g.  by  scaling  or  rotation.  Each  dot  and  its  transformed  counterpart  in 
the superimposed copy defines a dot pair. The radial pattern in figure 10, for example, is the result of a
scaling transformation. In order for the global organization to emerge from such patterns the orientation
corresponding to the lines defined by the dot pairs must be found.
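
The construction just described can be sketched in a few lines. This is a minimal illustration, assuming dots drawn uniformly over a square centered on the origin; the function name, the `amount` parameter and its default value are hypothetical choices of ours and are not taken from Glass (1969).

```python
import math
import random


def glass_pattern(n_dots, kind="radial", amount=0.1, seed=0):
    """Return a list of (dot, partner) pairs forming a Glass pattern.

    kind="radial" scales each dot about the origin by (1 + amount);
    kind="concentric" rotates each dot about the origin by `amount` radians.
    """
    rng = random.Random(seed)
    base = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n_dots)]
    if kind == "radial":
        scale = 1.0 + amount
        partners = [(scale * x, scale * y) for x, y in base]
    elif kind == "concentric":
        c, s = math.cos(amount), math.sin(amount)
        partners = [(c * x - s * y, s * x + c * y) for x, y in base]
    else:
        raise ValueError("kind must be 'radial' or 'concentric'")
    return list(zip(base, partners))


radial_pairs = glass_pattern(200, "radial")
concentric_pairs = glass_pattern(200, "concentric")
```

In the radial case every dot pair lies on a line through the pattern center; in the concentric case both dots of a pair lie at the same distance from the center. The local orientation of each pair is what the visual system must recover.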


Figure  10.  A  Glass  pattern  used  for  examining  the  detection  of  local  pairing  orientation. 


There  is  no  general  consensus  on  how  this  orientation  is  extracted.  Several  competing  theories 
have  been  proposed,  differing  primarily  in  terms  of  the  extent  to  which  the  component  dots  are  treated  as 
individual  elements.  The  Gestalt  investigators  described  perceptual  grouping  in  terms  of  individually 
marked items that are grouped explicitly on the basis of their attributes, proximity and their spatial
arrangement. The notion of establishing discrete grouping tokens was tacit, by and large, in the Gestalt
demonstrations of similarity grouping (Wertheimer 1923; Koehler 1929; Koffka 1935). Later, it was
specifically proposed that groupings involved "place markers" (Attneave 1974) or "place tokens" (Marr
1976, 1982), which individually carry information about position and attributes such as contrast, color,
size and orientation (see also Ullman's (1979) grouping tokens for motion correspondence). Pairings
between adjacent place tokens are represented by "virtual lines" (Attneave 1974; Marr 1976, 1982; Stevens
1978). Similarly, Caelli and Julesz (1978) discuss "local dipoles" between neighboring dots in texture
discrimination. Virtual lines are not illusory lines in the sense that subjective contours are illusory
intensity edges; they do not exhibit contrast phenomena. Instead, perceived pairings make explicit the
orientation, position, and separation of the similar and adjacent elements. Attneave (1955, 1974) has
demonstrated position and orientation judgements that seem mediated by place tokens, and Beck and
Halloran (1985) similarly suggest that virtual lines underlie some vernier acuity judgments. It is not clear,
however,  whether  the  position  markers  in  attentive  judgments  of  relative  position  and  orientation  under 
focal  scrutiny  are  the  same  "place  tokens"  that  have  been  proposed  for  early  visual  processing  of  texture. 

The  alternative  model  for  grouping  is  that  the  individual  dots  contribute  to  the  local  orientation 
only  implicitly,  in  terms  of  their  spatial  proximity  and  their  luminance  energy.  A  closely-spaced  pair  of 
dots,  or  a  chain  of  collinear  dots,  has  a  power  spectrum  similar  to  that  of  an  isolated  line  segment  for 
spatial  frequencies  less  than  1/s,  where  s  is  the  dot  spacing.  Low  spatial  frequency  filtering  would  therefore 
provide a means for extracting the local orientation signal, presumably using cortical cells tuned to both
orientation  and  spatial  frequency  (Ginsberg,  1973).  Recently,  a  model  for  dot  grouping,  based  on  Gaussian 
blur,  has  also  been  proposed  that  shares  similarities  with  earlier  models  operating  in  the  luminance  domain 
(Smits  &  Vos  1986). 

The  even-symmetric  cortical  simple  cell  has  been  specifically  proposed  as  responsible  for  the  local 
orientation  measurement:  the  dot  pair  would  be  roughly  equivalent  to  a  continuous  line  of  equal  total  energy 
presuming  a  linearly  summating  receptive  field.  The  structure  of  the  dot  pattern  would  emerge  without  need 
for  explicitly  marking  the  constituent  dots  as  tokens.  Glass  proposes  that  the  local  orientation  is  derived  by 
correlating  the  activity  of  simple  cells  over  small  neighborhoods  (Glass  1969;  Glass  &  Perez  1973;  Glass 
&  Switkes  1976;  Glass  1979).  Zucker  (1983)  proposes  a  cooperative  computation  whereby  the  broad 
orientation  tuning  curves  of  individual  receptive  fields  can  be  sharpened  by  combining  the  outputs  of 
individual  cells  over  local  neighborhoods.  Simple  cells  whose  receptive  fields  are  oriented  with  the  dot 
pairs  would  presumably  respond  more  vigorously,  on  average,  than  those  cells  at  other  orientations,  so  that 
local  correlation  (or  a  similar  computation)  of  their  activity  would  reveal  the  orientation  of  the  dot  pairs  in 
each  vicinity.  Similarly,  Caelli  and  Julesz  (1978)  suggest  that  linear  arrangements  of  dots  in  texture  are 
detected by neural units with elongated receptive fields applied to the retinal image, either "a single neural
feature extractor of the Hubel and Wiesel type", or a unit that "measures the quasicollinearity of adjacent
dipoles by combining single neural units of a retinal neighborhood with slightly different orientation
sensitivity" (Caelli & Julesz 1978, p. 172; see also Caelli et al. 1978; Julesz 1981). Prazdny (1984) also
suggested that the dot pairings in Glass patterns are detected by "... measurements in the spatial and energy
domain rather than logical operations on symbolic descriptions" (see also Prazdny 1986).

The  simple  cell  model,  while  attractive  in  its  ability  to  relate  a  grouping  effect  to  underlying 
neural mechanisms, seemingly cannot account for certain effects. Two effects merit discussion, the first
concerning  spatial  frequency  filtering  and  the  second  grouping  on  the  basis  of  contrast  and  color  similarity. 
Both  effects,  which  are  contrary  to  the  simple  cell  model,  are  reviewed  in  the  following. 

In (Stevens & Brookes 1987) we presented various arguments against using existing models of
simple cells for dot grouping. The first of these arguments concerns the theory that organization is carried
exclusively  by  low  spatial  frequencies.  It  has  been  shown  by  Carlson  et  al.  (1980)  and  Janez  (1984)  that 
dotted-line  organization  can  be  seen  in  high-pass  spatial  frequency  filtered  patterns.  They  conclude  that 
some  process  "more  abstract"  than  low  spatial  frequency-tuned  channels  is  involved.  In  a  similar  paradigm, 
we  generated  dot  patterns  with  only  high  spatial  frequency  content  (Stevens  &  Brookes  1987).  The 
individual  items  were  3x3  pixels  on  a  side,  each  a  black-and-white  checkerboard  with  the  center  pixel  the 
same  grey  value  as  the  background  grey.  The  argument  against  simple  cells  is  that  the  stimuli  have 
insignificant  power  in  the  range  of  spatial  frequencies  at  which  a  correspondingly  scaled  simple  cell  would 
respond. Using energy-balanced checkerboards we found that the impression of local pairings and
parallelism is apparent when the correlated dots are separated by as much as 30 arc min, which is beyond
the range in which simple cells are believed to exist foveally. In a similar experimental design, Prazdny (1984)
reported, contrary to our finding, that when a Glass pattern is composed of individual Laplacian-like dots
(e.g. a central white point surrounded by a ring of darker points) the apparent organization of the pattern
is lost when the mean luminance of the Laplacian-like dots matches the background grey. We likewise found
this effect, but attribute the problem to the weak energy of the tiny Laplacian-like features used, particularly
when presented in the parafovea where they could barely be resolved. When we scaled the Laplacian-like
features linearly with eccentricity, we found that the organization of the Glass pattern was successfully
preserved when the mean luminance of the individual features matched the background (Stevens & Brookes
1987). This evidence, at the time, convinced us of a non-simple cell contribution to the preattentive
extraction  of  structure  among  tokens  in  texture. 
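
The energy-balanced items described above can be illustrated with a small sketch. The grey levels and the assignment of light and dark pixels to corners and edges are our assumptions, chosen only so that the arithmetic is easy to follow; the text specifies only a 3x3 black-and-white checkerboard whose center pixel equals the background grey.

```python
BACKGROUND = 128  # hypothetical background grey level
DELTA = 100       # hypothetical excursion of light and dark pixels about the grey


def balanced_checkerboard(background, delta):
    """3x3 checkerboard item whose mean luminance equals the background."""
    light, dark = background + delta, background - delta
    return [
        [light, dark, light],
        [dark, background, dark],
        [light, dark, light],
    ]


element = balanced_checkerboard(BACKGROUND, DELTA)
mean_level = sum(sum(row) for row in element) / 9
print(mean_level)  # prints 128.0
```

Because the four light and four dark pixels are symmetric about the grey, the item has no net luminance difference from the background, which is why a receptive field spanning the whole item, and summing linearly, would receive little or no drive from it.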

The  second  line  of  evidence  against  the  simple  cell  model  for  dot  grouping  concerns  the  preference 
for similarity among the items that are grouped. Similarity preference, first described as a Gestalt
organizing principle, was demonstrated by Stevens (1978) in rivalrous Glass patterns. Dot triples were
presented rather than mere dot pairs, where two of the three dots were of similar contrast but dimmer than
the third. Contrary to contrast summation models, observers saw the organization carried by pairings of
low contrast dots. This general finding was replicated and extended to color similarity in (Stevens &
Brookes 1987), further strengthening the argument for token-based grouping in texture, rather than a process
that operates purely in the luminance-energy domain.

The  above  two  lines  of  argument  against  simple  cells  constituting  the  underlying  grouping 
mechanism  are  ultimately  based  on  assumptions  about  the  linearity  with  which  contrast  energy  is 
summated by simple cells (Hubel & Wiesel, 1962; Bishop, Coombs & Henry, 1971; Bishop & Henry,
1972).  The  major  deviation  from  linearity  is  expected  in  length  and  width  summation  (Heggelund,  Krekling 
&  Skottun  1983;  Henry,  Goodwin  &  Bishop  1978;  Webster  &  DeValois  1985),  presumably  due  to  a 
Gaussian-shaped  weighting  envelope,  such  as  modelled  by  a  Gabor  filter  (Daugman  1985)  (see  also 
Movshon,  Thompson  &  Tolhurst  1978  regarding  superposition).  We  will  summarize  this  view  of  the 
simple  cell  as  assuming  linear-weighted  spatial  summation. 

Recent  neurophysiological  results  show  that  simple  cells  are  not  strictly  linear  summation  devices. 
Heggelund  et  al.  (1983)  showed  that  excitation  and  inhibition  in  simple  cells  varied  nonlinearly  across  the 
receptive  fields.  Their  results  indicated  that  there  is  some  overlap  of  the  excitatory  and  inhibitory  fields. 
Hammond  and  MacKay  (1983)  found  that  while  a  long  bar  will  stimulate  a  simple  cell,  the  same  bar,  with 
the  addition  of  a  point  of  opposite  contrast  anywhere  along  the  bar,  will  not.  The  effect  is  more  like  a 
logical  gating  than  summation  since  a  small  point  will  turn  off  the  response  of  even  a  very  long  bar.  This 
property  was  found  in  each  simple  cell  that  was  tested.  The  term  "gating"  inhibition  was  given  to  this 
phenomenon  to  distinguish  it  from  simple  removal  of  excitatory  drive,  and  appears  to  be  a  property  of 
complex  cells  as  well  (Hammond  &  MacKay  1983b,  1985). 
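
The difference between summation and gating can be made concrete with a toy calculation. The sketch below is our own illustration and not a model taken from Hammond and MacKay: the receptive field is reduced to a one-dimensional strip of equal excitatory weights along the bar's axis, and gating is caricatured as a veto by any sample of opposite contrast.

```python
def linear_response(weights, contrasts):
    """Half-wave rectified weighted sum of contrast along the receptive field axis."""
    return max(0.0, sum(w * c for w, c in zip(weights, contrasts)))


def gated_response(weights, contrasts):
    """Same sum, except that any opposite-contrast sample in an excitatory region vetoes it."""
    if any(w > 0 and c < 0 for w, c in zip(weights, contrasts)):
        return 0.0
    return linear_response(weights, contrasts)


weights = [1.0] * 9                        # hypothetical uniform excitatory strip
full_bar = [1.0] * 9                       # bar of constant contrast
bar_with_gap = [1.0] * 4 + [0.0] + [1.0] * 4       # one sample at background contrast
bar_with_reversal = [1.0] * 4 + [-1.0] + [1.0] * 4  # one sample of reversed contrast

for name, bar in [("full", full_bar), ("gap", bar_with_gap), ("reversal", bar_with_reversal)]:
    print(name, linear_response(weights, bar), gated_response(weights, bar))
```

Under linear summation the reversed-contrast point costs only slightly more than a gap (7 versus 8 units out of 9 in this toy example), whereas under gating it abolishes the response entirely, which is the qualitative pattern described above.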

This  gating  property  of  simple  cells  provides  a  novel  way  of  testing  the  role  of  simple  cells  in  dot 
grouping  processes.  If  simple  cells  are  involved  in  grouping  then  we  should  be  able  to  nullify  that 
grouping  by  placing  a  point  of  opposite  contrast  between  each  pair  of  points  in  the  pattern.  This  is  not 
conclusive evidence that simple cells are the mechanism involved, but since other cells, such as complex
cells, were not found to have this property it would constitute very strong evidence. The following
experiment (Brookes & Stevens, in preparation) shows that dot groupings can in fact be nullified by
exploiting this property.

Method

Stimuli: The stimuli consisted of three types of modified Glass patterns, each made up of triples of dots. In
each  case  the  triples  were  a  pair  of  white  dots  and  a  single  black  dot,  all  presented  against  a  grey 
background.  The  dot  pairs  were  oriented  either  in  radial,  concentric  or  random  orientation  relative  to  the 
pattern  center.  For  each  of  these  three  global  organizations  there  were  two  possible  positions  for  the 
opposite-contrast  (black)  dot  associated  with  each  pair  of  white  dots.  Manipulating  the  position  of  the 
opposite-contrast  dot  was  intended  to  control  for  the  possibility  that  these  dots  might  simply  be  distractors 
or  noise.  In  the  first  case,  the  black  dot  was  placed  midway  between  the  two  white  dots  of  each  pair,  the 
analogue  to  the  stimuli  used  by  Hammond  and  MacKay.  In  the  second,  the  black  dot  was  placed  adjacent  to 
the  pair  at  a  distance  of  half  the  spacing  of  the  pair.  In  this  control  case  the  opposite-contrast  dots  were 
displaced to lie adjacent to, rather than between, the paired dots of like contrast. The separation between dot
triples  was  approximately  34'  with  the  individual  dot  pairs  separated  by  approximately  9'.  Each  Glass 
pattern  was  constructed  from  an  underlying  set  of  points  of  homogeneous  density  with  no  discernible 
structure  (see  Stevens  1978  for  method).  Thus  the  overall  pattern  consisted  of  an  organized  collection  of  dot 
triples,  each  consisting  of  two  white  dots  in  either  radial,  concentric  or  random  organization,  and  the  third 
black  dot  in  either  the  between  or  adjacent  position  relative  to  the  pair.  The  stimuli  are  shown  in  figure 
11.  The  stimuli  were  generated  by  a  Symbolics  3675  Lisp  Machine  and  displayed  on  a  Tektronix  690SR 
monitor. 

Procedure:  Eight  subjects  participated.  All  but  one  were  naive  to  the  purpose  of  the  experiment.  The 
subjects  were  seated  2m  from  the  stimulus  display.  A  trial  consisted  of  the  presentation  of  one  of  the  six 
conditions  (three  types  of  dot  patterns  with  2  possible  positions  of  black  dots  each)  chosen  randomly,  and 
the task was to decide which of the three patterns was present. Initially, a small fixation cross was
presented for 1 sec, after which the dot stimulus was presented for 200 msec without masking. Subjects
were  to  respond  to  this  stimulus  by  pressing  one  of  three  buttons  on  a  mouse  pointing  device.  Each  was 
given  20  repetitions  of  the  6  stimuli,  for  a  total  of  120  trials. 

Results 

Table  3  shows  the  data  for  the  eight  subjects.  In  most  cases  there  was  a  strong  tendency  to  see  the  pattern 
correctly  for  stimuli  in  which  the  opposite  contrast  dot  appeared  adjacent  to  the  dot  pairs.  In  this  case  the 
mean  percentage  of  correct  responses  was  94%  for  the  concentric  case,  78%  for  the  radial  case  and  92%  for 
the random case. For the condition where the opposite contrast dot appeared between the paired dots these
means were lowered to 28% for the concentric case and 31% for the radial case. For the random case the mean
was 85%.

Figure 11. Experimental stimuli for examining effect of contrast reversal in dot pairing.


An ANOVA was performed to test the significance of these data. The dot-between and dot-adjacent
cases were found to be significantly different (F(7, 1) = 37.3). There was also a significant difference
between Glass pattern types (F(14, 2) = 11.8), indicating that the difference was due to the concentric and
radial patterns. There also appears to be a secondary effect for most subjects. In many instances the
dot-between condition induced a reversal, where subjects indicated the radial pattern when the concentric
pattern was presented and the concentric pattern when the radial pattern was presented. In some cases this
constituted an almost complete perceptual reversal. In others there was simply a greater number of responses
in the direction of the reversal than for the correct pattern.

Discussion 

Despite  the  earlier  evidence  against  the  model  that  dot  groupings  are  performed  by  simple  cells,  this 
experiment provides, we believe, strong evidence that implicates their involvement, in much the manner
originally proposed by Glass and Prazdny. The extinction of apparent organization among the white dots
by the inclusion of black dots at the midpoint between each pairing is consistent with the gating
nonlinearity finding reported earlier and, as yet, has not been associated with any other mechanism.

In  addition,  there  is  a  secondary  effect  of  apparent  reversal,  such  that  a  radial  Glass  pattern  may 
appear  concentric  when  the  opposite  contrast  dots  are  placed  between  the  original  dot  pairs,  and  vice  versa, 
for most observers we tested. This shows that there is not only a gating or suppression of the output of
striate cells by the opposing contrast but also a facilitation of the perpendicular orientation. This is similar
to the subtle effect shown in (Glass & Switkes 1976) in which paired black and white dots are perceived as
having a roughly orthogonal organization to that suggested by the pairings. Glass (1979) proposes that
those simple cells positioned between the black and white dots and oriented perpendicularly to the dot pair
may be weakly stimulated. While perpendicularly oriented receptive fields may contribute to the phantom
impression of orthogonal orientation, we should note that more is likely involved since in the experiment
we  found  that  few  subjects  experienced  complete  reversals  in  apparent  organization.  But  overall,  we 
conclude  that  manipulation  of  apparent  organization  by  the  position  of  the  opposite  contrast  dots  is  strong 
evidence  implicating  contrast-summation  receptive  fields  in  the  extraction  of  the  local  orientation 
mechanism. 

We  are  still  faced  with  the  rather  convincing  arguments  reported  earlier  against  simple  cells  being 
the mechanism used for grouping discrete points. These, recall, were the demonstrations of apparent
organization  in  energy-balanced  dot  patterns  (e.g.  made  of  Laplacian-shaped  luminance  features)  and  the 
observed  preference  for  similarity  in  dot  groupings. 

Table 3. Raw data for experiment. Rows represent responses for a particular pattern.

We have already cited neurophysiological evidence that simple cells do not perform strict linear
summation  when  presented  with  opposite  contrast  within  their  excitatory  subfields.  The  energy-balanced 
dot  pattern  demonstrations  rested  on  a  similar  assumption  about  the  summation  properties  of  simple  cells. 
It  was  assumed  that  contrast  features  having  only  very  high  spatial  frequency  content  would  not  provide 
effective  input  to  receptive  fields  sufficiently  large  as  to  span  two  such  features.  While  apparently  valid  on 
the  basis  of  optimal  spatial  frequency  tuning  of  cortical  receptive  fields  as  a  function  of  their  size,  one 
cannot  rule  out  the  possibility  that  large  cortical  receptive  fields  receive  input  from  very  small  LGN  fields. 
If  that  were  the  case,  the  various  demonstrations  involving  high  spatial  frequency  stimuli  would  be  invalid. 

The  second  line  of  evidence  concerns  similarity  grouping.  We  believe  that  the  various 
demonstrations  of  similarity  preference  are  valid,  and  this  must  be  reconciled  with  the  underlying  role  of 
elongated cortical cells in detecting or encoding the pairing orientation. The preference for color similarity
demonstrated by rivalrous Glass patterns (Stevens & Brookes 1987) must either be directly attributed to
color  specificity  of  corresponding  cortical  receptive  fields  or  associated  with  later  selection  processes.  Since 
pairings  can  also  be  made  between  dots  of  similar  contrast,  at  least  part  of  the  selection  would  presumably 
be  performed  later. 

Prazdny  (1986)  suggests  that  simple  cells  detect  orientations  within  separate  feature  spaces.  His 
example  of  feature  spaces  is  the  separation  of  dark  and  light  points  due  to  the  separation  of  the  on  and  off 
cells  in  the  retina.  This  idea  may  account  for  many  of  the  properties  of  dot  grouping  if  we  broaden  the 
notion  of  feature  spaces  so  that  it  includes  each  of  the  features  that  seem  to  control  the  grouping.  Along 
with  each  of  these  feature  spaces  must  be  a  mechanism  tuned  to  that  property  so  that  it  may  be 
distinguished. 

2.4  Connecting  Contour  Fragments  across  Gaps 

One  of  the  first  problems  in  bootstrapping  forms  is  to  connect  fragments  of  the  boundary  that,  for  one 
reason  or  another,  are  disconnected.  It  seems  that  the  size  of  the  gap  can  have  a  great  effect  on  how  the 
constituent  parts  are  treated.  For  large  gaps  the  parts  are  seen  as  separate  units,  while  for  sufficiently  small 
gaps  the  parts  form  a  single  unit  which  in  some  ways  is  equivalent  to  a  continuous  contour.  The  critical 
size  of  the  gap  seems  to  scale  with  the  size  of  the  contour  fragments.  Some  of  the  problems  are:  what 
properties  are  retained  by  the  individual  pieces  when  they  are  connected,  which  are  lost,  and  what  are  the 
relationships  of  this  property  to  oriented  receptive  fields? 

Within a critical distance, we have found that separate pieces of contour are treated as single
entities.  This  behavior  would  be  expected  at  the  limit  of  resolution  where  the  gap  may  simply  be  ignored, 
but  the  property  seems  to  scale  with  the  size  of  the  contour  segments. 

Recently,  Julesz  (1986)  found  that  there  are  two  critical  distances  in  texture  discrimination.  The 
first,  which  Julesz  calls  delta,  is  the  distance  between  texture  elements.  The  second,  epsilon,  is  the  distance 
between  the  pieces  that  make  up  the  individual  texture  elements.  The  significance  of  the  epsilon  distance  is 
that  texture  elements  within  epsilon  of  each  other  are  combined  into  a  single  element.  These  distances  are 
relative  to  the  size  of  the  texture  element  and  are  thus  similar  to  the  property  we  found  with  regard  to 
contour  aggregation.  An  example  of  this  property  is  shown  in  figure  12.  In  figure  12a  the  gap  between  the 
crosses  is  small  and  the  figure  is  seen  as  a  grid.  In  figure  12b  the  gap  is  greater  than  the  critical  distance  so 
the figure is seen as a collection of crosses. In figures 12c and 12d the grids have been doubled in size with
the  gaps  doubling  as  well.  In  figure  12c  the  grid  is  seen  while  in  figure  12d  the  crosses  are  seen.  Thus  the 
critical  distance  scales  with  the  size  of  the  crosses.  These  demonstrations,  of  course,  are  preliminary  and 
will  require  formal  experiments.  Tentatively,  however,  we  conclude  that  the  ability  to  group  across  gaps  is 
dependent  on  the  sizes  of  the  constituent  parts  in  the  grouping.  When  the  crosses  are  skewed  so  that  the 
ends  do  not  line  up  there  is  a  similar  critical  distance  for  the  gaps,  so  that  if  the  gap  is  smaller  a  grid  is 
seen,  and  if  larger,  the  crosses  are  seen  (figure  13).  This  distance  seems  to  be  smaller  than  that  of  the 
collinear  crosses,  but  careful  experiments  must  be  performed  before  this  is  known. 
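
The scaling claim above can be written as a simple rule: grouping depends on the ratio of gap to element size, not on the gap alone. The following is a minimal sketch of that rule, not a model from this report. The function name and the value of `critical_ratio` are hypothetical; the report gives no numerical value for the critical distance.

```python
def perceived_as_grid(gap, element_size, critical_ratio=0.25):
    """Return True if elements separated by `gap` are predicted to group.

    `critical_ratio` is a hypothetical constant chosen for illustration;
    the critical distance is assumed to equal critical_ratio * element_size.
    """
    if element_size <= 0:
        raise ValueError("element_size must be positive")
    return gap / element_size < critical_ratio


# Analogues of figures 12a-12d: doubling both the gap and the element
# size leaves the gap-to-size ratio, and so the prediction, unchanged.
small_grid = perceived_as_grid(gap=1.0, element_size=10.0)
small_crosses = perceived_as_grid(gap=4.0, element_size=10.0)
large_grid = perceived_as_grid(gap=2.0, element_size=20.0)
large_crosses = perceived_as_grid(gap=8.0, element_size=20.0)
```

Under this assumption, the smaller critical distance suggested for skewed crosses would correspond to a smaller `critical_ratio`, a value that only formal experiments could supply.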

Kulikowski  (1969)  has  shown  a  length-dependent  effect  on  contrast  threshold  for  detection  of  a 
straight  line  up  to  a  length  of  60  arc  min  (and  corresponding  width  summation  up  to  about  6  arc  min). 
This  60  arc  min  summation  area  is  presumed  to  be  the  overall  span  of  facilitatory  interaction  among 
collinear  subunits,  each  about  9  arc  min  in  length  (Andrews  1967a,  b;  Bacon  &  King-Smith  1977).  Thus, 
there  is  psychophysical  evidence  for  facilitation  across  collinear  receptive  fields.  This  detection  is  degraded 
by  gaps  more  than  expected  by  linear  summation  (Andrews  1967b;  Sakitt  1971).  However,  it  is  possible 
that  this  intolerance  to  gaps  has  no  relation  to  the  present  problem  since  our  stimuli  are  well  above  the 
detection  threshold. 

Recent  anatomical  and  physiological  work  supports  our  findings.  Gilbert  and  Wiesel  (1979,  1983) 
have  shown  that  single  striate  cortical  cells  have  dendrites  that  are  located  at  some  distance  from  the  cell 
body.  Ts’o  et  al.  (1986b)  demonstrated  that  interactions  between  neurons  in  different  cortical  columns  are 
constrained  by  orientation.  Thus,  stimuli  of  the  same  orientation  but  in  different  retinal  locations  mutually 
reinforce  activity  in  the  same  groups  of  neurons.  In  some  cases,  parallel  orientations  may  be  reinforced,  but 
in  others,  collinearity  may  be  signalled. 




The  property  of  connecting  contour  fragments  into  a  single  unit  is  suggestive  of  the  notion  of  an 
"emergent feature". Pomerantz et al. (1977) suggested that under certain circumstances discrete items can
combine into single elements and take on properties that the individual elements did not possess alone.
Thus,  discrimination  between  two  elements  may  be  facilitated  when  an  emergent  feature  is  present  in  one 
element  but  not  in  the  other.  Another  way  to  state  this  is  that  discrimination  means  the  detection  of 
emergent  features. 

Treisman  (1984)  extended  this  work  by  showing  that  the  presence  of  an  emergent  feature  was 
sufficient to distinguish an object from a field of objects lacking that feature. She further showed that the
emergent  property  was  distinct  from  the  individual  parts  that  formed  the  feature.  For  example,  when  the 
visual  system  is  overloaded  by  many  distractors  consisting  of  elements  in  the  shape  of  a  "V",  the  closure 
property  can  be  transferred  from  a  circle  to  a  conjunction  of  features  forming  an  illusory  triangle.  The 
independence  of  the  property  from  the  object  suggests  a  separate  level  or  map  at  which  the  emergent  feature 
is  being  detected. 

The  demonstration  above  in  figure  12  shows  that  even  though  the  gaps  between  the  crosses  are 
still  apparent  there  is  a  level  at  which  the  gaps  are  ignored,  resulting  in  a  larger  connected  form.  This  larger 
form  acts  as  an  emergent  feature,  which  can  be  demonstrated  in  the  same  manner  in  which  Treisman  (1984) 
demonstrated  other  emergent  features.  Treisman  showed  that  an  object  can  be  distinguished  preattentively 
from  a  field  of  similar  objects  if  the  given  object  differs  from  the  field  in  a  single  feature.  This  is  true  for 
primitive  features  as  well  as  emergent  features. 

Figure 12. In a and c the gaps are small and the figure is seen as a grid. In b and d the gaps are larger and
the figure is seen as separate crosses.

Figure 13. As in figure 12, a is seen as a grid even though the lines are skewed, and b is seen as separate
crosses.